PREFACE


This document contains the proceedings of the training workshop on Emerging
and Future Computing Paradigms and their impact on the Research, Training and Design 
Environments of the Aerospace Workforce. The workshop was held at NASA Langley 
Research Center, Hampton, Virginia, March 18 and 19, 2003. The workshop was jointly 
sponsored by Old Dominion University and NASA. Workshop attendees came from 
NASA, other government agencies, industry and universities. The objectives of the 
workshop were to a) provide broad overviews of the diverse activities related to new 
computing paradigms, including grid computing, pervasive computing, high-productivity 
computing, and the IBM-led autonomic computing; and b) identify future directions for 
research that have high potential for future aerospace workforce environments. The 
format of the workshop included twenty-one half-hour overview-type presentations and
three exhibits by vendors. 
INTRODUCTION


The accelerating pace of computing technology development shows no signs
of abating. Computing power of 100 Tflop/s is likely to be reached by 2004, and
1 Pflop/s (10^15 flop/s) by 2007. The fundamental physical limits of computation, including
information storage limits, communication limits and computation rate limits, will likely
be reached by the middle of the present millennium. To overcome these limits, novel
technologies and new computing paradigms will be developed.

An attempt is made in this overview to put the diverse activities related to new
computing paradigms in perspective and to set the stage for the succeeding presentations.
The presentation is divided into five parts (Figure 1). In the first part, a brief historical
account is given of the development of computer and networking technologies. The second
part provides brief overviews of the three emerging computing paradigms - grid,
ubiquitous and autonomic computing. The third part lists future computing alternatives
and the characteristics of the future computing environment. The fourth part describes future
aerospace workforce research, learning and design environments. The fifth part lists the 
objectives of the workshop and some of the sources of information on future computing 
paradigms. 
BRIEF HISTORICAL ACCOUNT OF THE DEVELOPMENT OF COMPUTER
AND NETWORKING TECHNOLOGIES 


The field of computing is less than sixty years old. The first electronic computers
were built in the 1940s as part of the war effort, for scientific and mathematical
calculations to determine ballistic trajectories and break ciphers. The first transistor
was invented in 1947. By the 1950s, IBM and Univac were building business computers. Soon
other companies joined the effort - names like RCA, Burroughs, ICL and General
Electric - most of whom disappeared or left the computer business. The first
programming languages - Algol, FORTRAN, Cobol, and Lisp - were designed in the late
1950s, and the first operating system in the early 1960s. The first computer chip
appeared in the late 1970s, the personal computer around the same time, and the IBM PC
in 1981. Ethernet was invented in 1973 but did not appear in the market until 1980. It
operated at 10 megabits per second (10 Mb/s) and increased to 1 Gb/s (10^9 bits/s) in
1997. The Internet descended from the ARPANET in the 1970s, and the World Wide Web
was created in 1989 (see Figure 2).
EVOLUTION OF MICROPROCESSORS


Although the first computers used relays and vacuum tubes for the switching
elements, the age of digital electronics is usually said to have begun in 1947, when a
research team at Bell Laboratories designed the first transistor. The transistor soon
displaced the vacuum tube as the basic switching element in digital design. The nerve
center of a computer, or of any computing device, is its integrated circuit (IC or chip), a
small electronic device made out of a semiconductor material. Integrated circuits, which
appeared in the mid-1960s and allowed mass fabrication of transistors on silicon
substrates, are often classified by the number of transistors and other electronic
components they contain. The ever-increasing number of devices packaged on a chip has
given rise to the acronyms SSI, MSI, LSI, VLSI, ULSI, and GSI, which stand for small-scale
(1960s - up to 20 gates per chip), medium-scale (late 1960s - 20-200 gates per chip),
large-scale (1970s - 200-5000 gates per chip), very large-scale (1980s - over 5000 gates
per chip), ultra large-scale (1990s - over a million transistors per chip), and giga-scale
integration (over a billion transistors per chip), respectively (Figure 3).

In 1965, Gordon Moore hypothesized that processing power (number of 
transistors and computing speed) of computer chips was doubling every 18-24 months or 
so. For nearly four decades the chip industry has marched in lock step to this pattern or 
rule of thumb, which is referred to as Moore’s law (see Figure 3). 
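
As a rough illustration of this rule of thumb, the short Python sketch below computes the growth factor implied by a fixed doubling period. The function name `growth_factor` and the 40-year span used in the example are illustrative assumptions, not figures taken from the presentation.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Growth multiple after `years` if capacity doubles every `doubling_months`."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings


if __name__ == "__main__":
    # Compare the two ends of the 18-24 month range quoted in the text.
    for months in (18, 24):
        factor = growth_factor(40, months)
        print(f"doubling every {months} months for 40 years: x{factor:.3g}")
```

The spread between the two printed factors shows how sensitive long-range projections are to the assumed doubling period.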
GROWTH IN COMPUTER SPEED AND SHIFT IN HARDWARE
TECHNOLOGY


Advances in microprocessor technology have increased the speed of
computers by more than a trillion times during the last five decades, while dramatically
reducing the cost (Figure 4).

A number of technologies have been used to achieve ultrafast logic circuits.
These include new material systems such as gallium arsenide (GaAs); multichip
modules (MCM); monolithic and hybrid wafer-scale integration (WSI); new transistor
structures such as quantum-coupled devices using heterojunction-based
superlattices; and optical interconnections and integrated optical circuits. More recently,
carbon nanotubes used as transistors in chips, clockless (asynchronous) chips, and
hyper-threading, which makes a single CPU act in some ways like two chips, have been
demonstrated.

The incessant demand for computing power to enable accurate simulation of
complex phenomena in science and engineering has resulted in the development of a
class of general-purpose supersystems designed for extremely high-performance
throughput, and new paradigms for achieving high performance. These include:

■ Vector/pipeline processing 

■ Parallel processing on multiple (hundreds or thousands) CPUs, and 

■ Multitasking with cache memory microprocessors 
TOP FIVE SUPERCOMPUTER SITES


Although the peak performance of the first generation supersystems was less than 
100 Mflop/s, the gigaflop barrier (1 Gflop/s) was passed in 1988/89, and the teraflop 
barrier (1 Tflop/s) in 1996/7. In 1995, the US Department of Energy supported the 
development of three terascale machines through its Accelerated Strategic Computing 
Initiative (ASCI). The three machines are: ASCI Red, with 9,472 Intel Pentium II Xeon 
processors - 2.379 Tflop/s at Sandia National Labs; ASCI Blue Mountain with 5,856 
IBM PowerPC 604E processors - 1.608 Tflop/s at Los Alamos National Labs; and ASCI 
White with 8,192 IBM Power 3-II processors - 7.226 Tflop/s at Lawrence Livermore 
National Lab. 

To date, there are over 17 terascale machines worldwide. The maximum
performance reported today is 35.86 Tflop/s, achieved by the Earth Simulator in Kanazawa
Ward, Yokohama, Japan, which consists of 5,104 vector processors (with a peak performance
of 40 Tflop/s).

The top five supercomputer sites, based on the Linpack benchmark, are shown in
Figure 5. 
PETAFLOP SUPERSYSTEM


In December 1999, IBM announced a five-year effort to build a petaflop (10^15
flop/s) supersystem - the Blue Gene Project. The project has two primary goals:
advancing the state of the art of biomolecular simulation, and advancing the computer
design of extremely large-scale systems. Two systems are planned: Blue Gene/L, in collaboration
with Lawrence Livermore National Lab, which leverages high-speed interconnect and
system-on-a-chip technologies and has a peak performance of 200 Tflop/s; and Blue
Gene/P, the petaflop-scale system. The petaflop system will consist of more than one million
processors, each capable of one billion operations per second. Thirty-two of these
ultrafast processors will be placed on a single chip (32 Gflop/s). A compact two-foot by
two-foot board containing 64 of these chips will be capable of 2 Tflop/s. Eight of the boards
will be placed in 6-foot-high racks (16 Tflop/s), and the final system will consist of 64
racks linked together to achieve the one Pflop/s performance (Figure 6).
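
The packaging arithmetic quoted above can be checked with a few lines of Python. This is only a sketch that multiplies out the figures stated in the text; the constant and function names are illustrative assumptions, and "one billion operations per second" is treated as 1 Gflop/s per processor.

```python
# Figures quoted in the text for the planned petaflop-scale system.
FLOPS_PER_PROCESSOR = 10**9   # one billion operations per second
PROCESSORS_PER_CHIP = 32
CHIPS_PER_BOARD = 64
BOARDS_PER_RACK = 8
RACKS = 64


def blue_gene_hierarchy():
    """Multiply out peak rates at each packaging level (hypothetical helper)."""
    chip = FLOPS_PER_PROCESSOR * PROCESSORS_PER_CHIP
    board = chip * CHIPS_PER_BOARD
    rack = board * BOARDS_PER_RACK
    system = rack * RACKS
    processors = PROCESSORS_PER_CHIP * CHIPS_PER_BOARD * BOARDS_PER_RACK * RACKS
    return {"chip": chip, "board": board, "rack": rack,
            "system": system, "processors": processors}


if __name__ == "__main__":
    levels = blue_gene_hierarchy()
    for name in ("chip", "board", "rack", "system"):
        print(f"{name:>6}: {levels[name] / 1e12:.3f} Tflop/s")
    print(f"processors: {levels['processors']:,}")
```

The products come out slightly above the rounded values in the text (about 2.05 Tflop/s per board, 16.4 Tflop/s per rack and 1.05 Pflop/s overall), with 1,048,576 processors in total, which is consistent with "more than one million".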
EVOLUTION OF HUMAN-COMPUTER INTERFACES


Figure 7 shows the evolution of human-computer interfaces. From the
1940s through the 1970s, static, teletype-style interfaces were used for mainframes.
These were followed in the 1980s by more flexible interfaces for PCs - Windows,
mouse and graphical tablet. With many computing devices available to a single user,
adaptive interfaces with more functionality and communication became available. The
emergence of grid/pervasive computing paradigms is providing an impetus for intelligent
neural, perceptual, attentive and other advanced interfaces, which integrate adaptive
interfaces with intelligent agents to make intelligent help and tutoring available to the
user.
EMERGING COMPUTING PARADIGMS


The rapidly increasing power of computers and networks, and the trend of 
computers getting smaller, along with the increasing complexity of computing systems 
and the associated cost to manage them, led to three emerging computing paradigms, 
namely (Figure 8), 

■ Grid Computing, 

■ Ubiquitous/Pervasive Computing, and, 

■ Autonomic Computing 

The three paradigms are described subsequently. 



Figure 8 
GRID COMPUTING


The rapidly increasing power of computers and networks in the 1990s led to the
new paradigm of distributed computing. A flurry of experiments was conducted on
“peer-to-peer” computing, all devoted to harnessing the computing power and storage
capacity of idle desktop machines. These included cluster computing - using networks of
standard single-processor workstations to solve single problems. At the same time, the
high-performance computing community began more ambitious experiments in
metacomputing. The objective of metacomputing was to make many distributed
computers function like one giant computer - a metasystem (e.g., the virtual national
machine). Metasystems give users the illusion that the files, databases, computers and
external devices they can reach over a network constitute one giant transparent
computational environment.

The term grid computing is now used to refer to massive integration of computer 
systems to offer performance unattainable by a single machine. It provides pervasive, 
dependable, consistent, and inexpensive access to facilities and services that live in 
cyberspace, assembling and reassembling them on the fly to meet specified needs (Figure 9).



Figure 9 
GRID TECHNOLOGIES AND INFRASTRUCTURE


The essential building blocks of grid computing are fast processors, parallel
computer architectures, advanced optical networks, communication protocols, distributed
software structures and security mechanisms (Figure 10).

Grid technologies enable the clustering of a wide variety of geographically 
distributed resources, such as high-performance computers, storage systems, data 
sources, special devices and services that can be used as a unified resource. 

Although grid technologies are currently distinct from other major technology 
trends, such as internet, enterprise, distributed, and peer-to-peer computing, these other 
trends can benefit significantly from growing into the problem spaces addressed by grid 
technologies. 
GRID COMPUTING PROJECTS


Once the concept of grid computing was introduced, several grid computing
projects were launched all over the world. A sampling of grid computing projects is
listed in Figure 11. In the future, grids of every size will be interlinked. The
“supernodes” like TeraGrid will be networked clusters of supersystems serving users on a
national or international scale. Still more numerous will be the millions of individual
nodes: personal machines that users plug into the grid to tap its power as needed. With
wireless networks and miniaturization of components, these can evolve into billions of
sensors, actuators and embedded processors as micronodes.
UBIQUITOUS / PERVASIVE COMPUTING


The trend of computers getting smaller is likely to lead to an environment with 
computing functionality embedded in physical devices that are widely distributed and 
connected in a wireless web. 

In a seminal article written in 1991, Mark Weiser described a hypothetical world 
in which humans and computers were seamlessly united. This vision was referred to as 
ubiquitous computing. Its essence was the creation of an environment saturated with
computing and communication, yet gracefully integrated with human users. 

In the mid-1990s, the term pervasive computing came to represent the same vision as that 
described by Weiser. 

The key components of ubiquitous/pervasive computing are (Figure 12): 

■ Pervasive devices, including: 

- Small, low-powered hardware (CPU, storage, display devices, sensors), 

- Devices that come in different sizes for different purposes, and 

- Devices that are aware of their environment, their users, and their 
locations, 

■ Pervasive communication - a high degree of communication among devices, 
sensors and users provided by ubiquitous and secure network infrastructure 
(wireless and wired) and mobile computing, 

■ Pervasive interaction - more natural and human modes of interacting with 
information technology, and 

■ Flexible, adaptable distributed systems - dynamic configuration, 
functionality on demand, mobile agents and mobile resources 
PERVASIVE COMPUTING FRAMEWORK


The technological advances necessary to build a pervasive computing 
environment fall into four broad areas (Figure 13): devices, networking, middleware and 
applications. Middleware mediates interactions with the networking kernel on the user’s 
behalf and keeps users immersed in the pervasive computing space. The middleware 
consists mostly of firmware and software.
PERVASIVE COMPUTING INITIATIVES


A list of some of the pervasive computing initiatives is given in Figure 14. These
include university initiatives (AURA of Carnegie Mellon University, Endeavor of the
University of California at Berkeley, the Oxygen Project of MIT, and the Portolano Project
of the University of Washington); industry/university initiatives (Sentient Computing, a
joint project of AT&T Laboratories and Cambridge University in the UK); and industry
projects (Cooltown of Hewlett-Packard, EasyLiving of the Microsoft Research Vision Group,
and WebSphere Everyplace of IBM).
AUTONOMIC COMPUTING


The increasing capacity and complexity of emerging computing systems, and
the associated cost to manage them, combined with a shortage of skilled workers, are
providing the motivation for a paradigm shift to systems that are self-managing,
self-optimizing, and do not require the expensive management services needed today. A
useful biological metaphor is found in the autonomic nervous system of the human body:
it tells the heart how many times to beat, monitors the body temperature, and adjusts the
blood flow, but most significantly, it does all this without any conscious recognition or
effort on the part of the person. Hence the name autonomic computing was coined.

Autonomic computing is a new research area led by IBM focusing on making 
computing systems smarter and easier to administer. Many of its concepts are modelled 
on self-regulating biological systems. 

Autonomic computing is envisioned to include the ability of the system to 
respond to problems, repair faults and recover from system outages without the need for 
human intervention. An autonomic computing system consists of a large collection of 
computing engines, storage devices, visualization facilities, operating systems, 
middleware and application software (Figure 15). 
CHARACTERISTICS OF AUTONOMIC COMPUTING


Autonomic computing is envisioned to combine the following seven 
characteristics (Figure 16): 

1. Self-defining - has detailed knowledge of its components, current status,
ultimate capacity and performance, and all connections to other systems.

2. Self-configuring - can configure and reconfigure itself under varying and
unpredictable conditions. System configuration or setup must occur
automatically, as must dynamic adjustments to that configuration to handle
changing environments.

3. Self-optimizing - never settles for the status quo and always looks for ways to
optimize its performance. Monitors constituent parts and metrics, using
advanced feedback control mechanisms, and makes changes (e.g., fine-tuning
workflow) to achieve predetermined system goals.

4. Self-healing - able to recover from routine and extraordinary events that
might cause some components to malfunction or be damaged. It must be able to
discover problems and reconfigure the system to keep functioning smoothly.

5. Self-protecting - detects, identifies and protects itself against various types of
failure. Maintains overall system security and integrity.

6. Contextually aware - this is almost self-optimization turned outward. The
system must know the environment and the context of the surrounding
activity, and adapt itself (in real time) accordingly.

7. Anticipatory - anticipates the optimized resources, configuration, and
components needed.
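
To make the self-healing and self-optimizing characteristics more concrete, the following Python sketch shows one pass of a simple monitor/decide/act loop. It is a toy illustration under stated assumptions, not IBM's architecture: the `Component` class, the `control_step` function, the load metric and the 0.7 target are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical managed resource with a load metric and a health flag."""
    name: str
    load: float           # fraction of capacity in use, 0.0 to 1.0
    healthy: bool = True


def control_step(components, target_load=0.7):
    """Run one monitor/decide/act pass and return the actions taken."""
    actions = []
    # Self-healing: restart any component that reports a fault.
    for c in components:
        if not c.healthy:
            c.healthy = True
            c.load = 0.0
            actions.append(f"restart {c.name}")
    # Self-optimizing: if the busiest component exceeds the target,
    # shift part of its load to the least busy one.
    busiest = max(components, key=lambda c: c.load)
    idlest = min(components, key=lambda c: c.load)
    if busiest is not idlest and busiest.load > target_load:
        shift = min(busiest.load - target_load, target_load - idlest.load)
        if shift > 0:
            busiest.load -= shift
            idlest.load += shift
            actions.append(f"shift {shift:.2f} from {busiest.name} to {idlest.name}")
    return actions


if __name__ == "__main__":
    pool = [Component("a", 0.9), Component("b", 0.2),
            Component("c", 0.5, healthy=False)]
    for action in control_step(pool):
        print(action)
```

A real autonomic system would run such a loop continuously and draw on far richer knowledge of its components; the sketch only shows that no human intervention is needed between detecting a condition and acting on it.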
FUTURE COMPUTING ALTERNATIVES


Silicon-based technology is expected to reach its physical limits in the coming
decades. But silicon and computing are not inextricably linked, although they often seem
to be. A number of new approaches and technologies have already been proposed for the
time when silicon microelectronics reaches its ultimate physical limits. These include
(Figure 17):

■ Quantum computing, 

■ Molecular computing, 

■ Chemical and biochemical computing, 

■ DNA computing, and 

■ Optical and optoelectronic computing 

None of these approaches is ready to serve as an all-purpose replacement for 
silicon. In fact, some approaches may be only appropriate as specialized methods in 
particular niches, such as high-level cryptography. 
FUTURE COMPUTING ENVIRONMENT


Significant advances continue to be made in the entire spectrum of computing and 
communication technologies. Speculations about the future of computers and computing 
have been attempted in several monographs. Herein, only the emerging trends are 
identified, which include (Figure 18): 

■ An evolving computing paradigm combining ubiquitous / mobile / cognitive / 
autonomic computing and including: 

- Smart, self-regulating computing systems covering a spectrum of 
handheld, embedded and wearable information appliances and devices 

- Wide range of devices to sense, influence and control the physical world 

- Optical networks supplemented by wireless communication

■ Human-computer symbiosis characterized by: 

- Natural cooperative human-machine collaboration 

- Intelligent affective technologies to allow computers to recognize the user’s
emotional states

- Humans, sensors and computing devices seamlessly united 

■ Hierarchical knowledge nets: 

- Computer-supported distributed collaboration

- Augmented / mixed reality and tele-immersion facilities 

- Advanced modeling, simulation and multisensory visualization 
EXAMPLES OF FUTURE AEROSPACE SYSTEMS AND SOME OF THEIR
CHARACTERISTICS


The realization of NASA’s ambitious goals in aeronautics and space within the
current national budget constraints will require new kinds of aerospace systems and 
missions that use novel technologies and manage risk in new ways. Future aerospace 
systems must be autonomous, evolvable, resilient, and highly distributed. Two examples 
are given in Figure 19. The first is a biologically inspired aircraft with self-healing wings 
that flex and react like living organisms. It is built of a multifunctional material with 
fully integrated sensing and actuation, and unprecedented levels of aerodynamic 
efficiencies and aircraft control. The second is an integrated human-robotic outpost, with 
biologically inspired robots. The robots could enhance the astronauts’ capabilities to do
large-scale mapping, detailed exploration of regions of interest, and automated sampling 
of rocks and soil. They could enhance the safety of the astronauts by alerting them to 
mistakes before they are made, and letting them know when they are showing signs of 
fatigue, even if they are not aware of it. 
ENABLING TECHNOLOGIES FOR FUTURE AEROSPACE SYSTEMS


The characteristics of future aerospace systems identified in Figure 19 are highly
coupled and require the synergistic coupling of the revolutionary and other leading-edge 
technologies listed in Figure 20. The four revolutionary technologies are 
nanotechnology, biotechnology, information/knowledge technology, and cognitive 
systems technology. The other leading-edge technologies are high-productivity 
computing; high-capacity communication; multiscale modeling, simulation and 
visualization; virtual product development; intelligent software agents; reliability and risk 
management; human performance; and human-computer symbiosis.
THREE NASA INITIATIVES


The realization of NASA’s ambitious goals will require a diverse, technically 
skilled workforce - a new generation of scientists and engineers who can work across 
traditional disciplines and perform in a rapidly changing environment. 

NASA has developed a number of new initiatives for assured workforce 
development. These include University Research, Engineering, and Technology 
Institutes (URETIs), the National Institute of Aerospace (NIA), and the Hierarchical 
Research and Learning Network (HRLN) (see Figure 21). The overall goal of these 
activities is to strengthen NASA’s ties to the academic community through long-term 
sustained investment in areas of innovative and long-term technology critical to future 
aerospace systems and missions. At the same time, the three activities will enhance and 
broaden the capability of the nation’s universities to meet the needs of NASA’s science 
and technology programs. 



Figure 21 
HIERARCHICAL RESEARCH AND LEARNING NETWORK


The Hierarchical Research and Learning Network (HRLN) is a pathfinder project
for future aerospace workforce development. It aims at creating knowledge
organizations in revolutionary technology areas, which enable collective intelligence,
innovation and creativity to be brought to bear on the increasing complexity of future
aerospace systems. This is accomplished by building research and learning networks linking
diverse interdisciplinary teams from NASA and other government agencies with 
universities, industry, technology providers, and professional societies (Figure 22) in 
each of the revolutionary technology areas and integrating them into the HRLN. 

HRLN is envisioned as a neural network of networks. It is being developed by 
eight university teams, led by Old Dominion University’s Center for Advanced 
Engineering Environments. 
IMPLEMENTATION OF HIERARCHICAL RESEARCH
AND LEARNING NETWORK 


The phases of implementing HRLN are shown in Figure 23. The first phase 
involves development of learning modules and interactive virtual classrooms in 
revolutionary technology areas, simulators of unique test facilities at NASA, and a 
telescience system - an online multi-site lab that allows real-time exchange of 
information and remote operation of instrumentation by geographically distributed teams. 
These facilities will be integrated into adaptive web learning portals in the second phase, 
which evolve into robust learning networks. In the final phase, the learning networks are 
integrated into the HRLN. 
ADAPTIVE WEB LEARNING PORTAL


The Adaptive Web Learning Portal being developed as part of the HRLN project 
has the following major components (Figure 24): 

■ Advanced multimodal interfaces, 

■ Knowledge repository, 

■ Blended learning environment incorporating the three environments: expert- 
managed, self-paced, and collaborative, 

■ Learning management system, and 

■ Customized collaboration infrastructure 
INTELLIGENT DESIGN ENVIRONMENT


The future design environment will enable collaborative distributed synthesis to 
be performed by geographically dispersed interdisciplinary/multidisciplinary teams. It 
will include flexible and dynamic roomware (active spaces/collaboration landscape) 
facilities consisting of (Figure 25): 

■ Portable and stationary information devices 

■ Novel multiuser smart displays 

■ Telepresence and other distributed collaboration facilities 

■ Novel forms of multimodal human/network interfaces 

■ Middleware infrastructures and intelligent software agents 
OBJECTIVES AND FORMAT OF WORKSHOP


The objectives of the workshop are to (Figure 26): a) provide broad overviews of 
the diverse activities related to new computing paradigms, including grid computing, 
pervasive computing, high-productivity computing, and the IBM-led autonomic 
computing; and b) identify future directions for research that have high potential for 
future aerospace workforce environments. The format included twenty half-hour 
presentations in nine sessions, and three exhibits. 


• Objectives: 

- Overview of diverse activities related to 
emerging/new computing paradigms 

- Identify future directions for research for 
future aerospace workforce environments 

• Format: 

- 20 presentations; 9 sessions 

- 3 exhibits 

• Proceedings: 

- NASA Conference Proceeding 
INFORMATION ON EMERGING / NOVEL COMPUTING PARADIGMS AND
FUTURE COMPUTING ENVIRONMENTS 


A short list of books, monographs, conference proceedings, survey papers and
websites on emerging/novel computing paradigms and the future computing environment is
given below.

Books, Monographs and Conference Proceedings:

[1] Fogg, B.J., “Persuasive Technology - Using Computers to Change What We
Think and Do”, Morgan Kaufmann Publishers, 2003.

[2] Grigoras, D., Nicolau, A., Toursel, B. and Folliot, B. (editors), “Advanced
Environments, Tools, and Applications for Cluster Computing”, NATO
Advanced Research Workshop, IWCC 2001, Mangalia, Romania, September 2001:
Revised Papers, Springer-Verlag New York, Inc., May 2002.

[3] Carroll, J.M. (editor), “Human-Computer Interaction in the New Millennium”,
ACM Press, New York, 2002.

[4] Luryi, S., Xu, J., and Zaslavsky, A., “Future Trends in Microelectronics: The
Road Ahead”, John Wiley and Sons, Inc., 2002.

[5] Calude, C.S., and Paun, G., “Computing with Cells and Atoms: An Introduction to
Quantum, DNA and Membrane Computing”, Taylor & Francis, 2001.

[6] Barkai, D., “Peer-to-Peer Computing - Technologies for Sharing and
Collaborating on the Net”, Intel Press, 2001.

[7] Nielsen, M.A. and Chuang, I.L., “Quantum Computation and Quantum
Information”, Cambridge University Press, 2001.

[8] Hirvensalo, M., “Quantum Computing”, Springer-Verlag TELOS, 2001.

[9] Kim, D., and Hariri, S., “Virtual Computing: Concept, Design, and Evaluation”,
Kluwer Academic Publishers, August 2001.

[10] Greenia, M.W., “History of Computing: An Encyclopedia of the People and
Machines that Made Computer History”, CD-ROM, Lexikon Services, 2000.

[11] Ceruzzi, P.E., “A History of Modern Computing”, Massachusetts Institute of
Technology, 1999.

[12] Foster, I., and Kesselman, C. (editors), “The Grid: Blueprint for a New
Computing Infrastructure”, Morgan Kaufmann Publishers, Inc., San Francisco,
CA, 1999.

[13] Denning, P.J. (editor), “Talking Back to the Machine: Computers and Human
Aspiration”, Springer-Verlag, New York, 1999.

[14] Kurzweil, R., “The Age of Spiritual Machines: When Computers Exceed Human
Intelligence”, Penguin Putnam, Inc., New York, 1999.

[15] Williams, C.P., and Clearwater, S.H., “Explorations in Quantum Computing”,
Springer-Verlag, New York, 1998.

[16] Paun, G., Rozenberg, G., Salomaa, A., and Brauer, W., “DNA Computing: New
Computing Paradigms (Texts in Theoretical Computer Science)”,
Springer-Verlag, 1998.

[17] Maybury, M.T., and Wahlster, W. (editors), “Readings in Intelligent User
Interfaces”, Morgan Kaufmann Publishers, Inc., 1998.

[18] Denning, P.J., and Metcalfe, R.M. (editors), “Beyond Calculation: The Next Fifty
Years of Computing”, Springer-Verlag, New York, 1997.

[19] Williams, M.R., “A History of Computing Technology”, IEEE Computer Society,
2nd edition, 1997.

[20] Shurkin, J., “Engines of the Mind”, W.W. Norton & Company, New York, 1996.

[21] Prince, B., “High Performance Memories: New Architecture DRAMs and SRAMs -
Evolution and Function”, John Wiley and Sons, Inc., 1996.

[22] Wherrett, B.S., and Chavel, P. (editors), “Optical Computing”, Proceedings of the
International Conference, Heriot-Watt University, Edinburgh, U.K., August 22-25,
1994, IOP Publishing, March 1995.

[23] McAulay, A.D., “Optical Computer Architectures: The Application of Optical
Concepts to Next Generation Computers”, John Wiley and Sons, Inc., 1991.

Special Issues of Journals:

[1] Vertegaal, R., “Attentive User Interfaces”, Editorial, Special Issue on Attentive
User Interfaces, Communications of the ACM, Vol. 46, No. 3, March 2003.

[2] “Limits of Computation”, Special Issue of Computing in Science and
Engineering, May/June 2002.

[3] “The Future of Computing - Beyond Silicon”, Special Issue of Technology
Review, MIT’s Magazine of Innovation, May/June 2000.

[4] Caulfield, H.J., “Perspectives in Optical Computing”, IEEE Computer, Vol. 31,
No. 2, February 1998, pp. 22-25.

[5] “50 Years of Computing”, Special Issue of IEEE Computer, Vol. 29, No. 10,
October 1996.

Survey Papers and Articles:

[1] Compton, K., and Hauck, S., “Reconfigurable Computing: A Survey of Systems
and Software”, ACM Computing Surveys, Vol. 34, No. 2, June 2002, pp. 171-210.

[2] Waldrop, M.M., “Grid Computing”, Technology Review, May 2002, pp. 31-37.

[3] Noor, A.K., “New Computing Systems and Future High-Performance Computing
Environment and Their Impact on Structural Analysis and Design”, Computers
and Structures, Vol. 64, Nos. 1-4, July-August 1997, pp. 1-30.

[4] Weiser, M., “The Computer for the 21st Century”, Scientific American, Vol. 265,
No. 3, September 1991, pp. 66-75.

Websites : 

1. MIT Project Oxygen - Pervasive Human-Centered Computing 
http://oxygen.lcs.mit.edu 

2. Autonomic Computing - Creating Self-Managing Computing Systems 
http://www-3.ibm.com/autonomic 

3. Pervasive Computing - Anywhere. Anytime. On Demand. 
http://www.darpa.mil/ipto/research/hpcs 

4. The Globus Project 
http://www.globus.org 

5. Quantum computation: a tutorial 
http://www.sees.bangor.ac.uk/~schmuel/comp/compt.html 

6. Stanford University, U.C. Berkeley, MIT, and IBM Quantum Computation 
Project 

http://divine.stanford.edu 

7. DNA Computers 

http://members.aol.com/ibrandt/dna_computer.html 

8. Publications on DNA based Computers 
http://crypto.stanford.edu/~dabo/biocomp.html 

9. European Molecular Computing Consortium (EMCC) 
http://openit.disco.unimib.it/emcc 
AUTONOMIC COMPUTING - AN OVERVIEW 


This presentation covers the autonomic computing vision and initiative. In 2001, 
Paul Horn, IBM’s Senior VP of Research, issued a “grand challenge” to the computing 
industry to develop more self-managing systems. Since that time, IBM has been working 
internally and externally to advance the state of its products and the industry to 
autonomic computing capabilities. IBM has a corporate initiative to drive autonomic 
computing. Ric Telford is the Director of Architecture and Technology for this initiative. 



Autonomic Computing: 
An Overview 


Ric Telford 

Director, Architecture and Technology 
Autonomic Computing, IBM 
THE E-BUSINESS EVOLUTION 


IBM introduced the concept of "e-business on demand" last year to describe the 
next evolution of e-business. We are now at a stage where the need for e-business on 
demand is becoming the priority for companies. E-business on demand allows for a more 
loosely coupled, service-oriented approach to e-business infrastructures. It allows 
systems to combine and separate as required in real time to address the business 
problems at hand. It allows for the leveraging of systems not only within a data center, 
but across multiple data centers, other businesses, and service providers. This is the 
vision of computing: being able to construct systems more easily, and being able to run 
these systems with minimal downtime and minimal human intervention. IBM realized a while 
back that we need some new, fundamental technologies to make this vision of dynamic 
e-business real. Specifically, IT systems need the ability to manage themselves: to react 
to problems, re-configure based on load, guard against attacks and continually optimize 
based on set policies. This is autonomic computing. 
AUTONOMIC COMPUTING AS PART OF E-BUSINESS ON DEMAND 


An on-demand business has some key qualities: the ability to respond to 
requests/demands in real time; the ability to have variable cost structures in IT, 
versus the static (fixed) cost structures of today; the ability to have the 
time-consuming tasks of managing an IT system be more self-managed, allowing IT 
professionals to work on that which is core and differentiating for the business; and, 
finally, the ability for a system to be resilient and highly available. 

To build such a system, you need an operating environment with some key 
capabilities: the ability to integrate across systems, a basis in open standards, a 
virtualized infrastructure and, of course, autonomic computing capabilities. 
■ Integrated 
■ Open 
■ Virtualized 
■ Autonomic 
WHY AUTONOMIC COMPUTING? 


The term Autonomic Computing comes from the human autonomic nervous 
system. The autonomic system “self-manages” the body (heart rate, breathing, etc). 
Computing systems need to be able to do the same. 



Why Autonomic Computing? 


Autonomic nervous 
system governs body 
functions. 


"Intelligent" Systems that... 

■"Know" themselves 
■Continuously optimize 
■Recover from routine/extraordinary events 
■Anticipate and adapt to user needs 
■Protect against attacks threatening the system 
■Understand the external environment in which they operate 
■Support heterogeneous environments via open standards 
■Configure and re-configure under varying and unpredictable conditions 


Autonomic Computing ©2003 IBM Corporation 
COMPLEXITY 


Autonomic computing is needed today because of the ever-growing 
complexity of systems. Keeping systems properly configured, optimized and running is a 
very labor-intensive task. Autonomic computing is focused on reducing this complexity 
in the IT infrastructure. 



Complexity 



Complex heterogeneous infrastructures are a reality! 
MARKETPLACE DRIVERS FOR AUTONOMIC COMPUTING 


Although complexity is the most significant driver for autonomic computing, 
there are others as well. 


Marketplace Drivers for Autonomic Computing 

■ Complexity 

Exponential growth in the number and variety of systems and 
components 

Heterogeneous and distributed interconnected environments 

Unpredictable, highly variable workloads 

■ IT has become mission critical 

User expectation of 24/7 availability 

New business models totally dependent on IT; cost of downtime is 
huge 

■ Total Cost of Ownership 

Economy is forcing more pressure on ROI models and operational 
cost reductions 

Human costs exceed technology costs 

■ IT Skills 

Scarcity of IT professionals to install, configure, optimize and 
maintain the heterogeneous IT systems 
WHAT IS AUTONOMIC COMPUTING? 


We talk about self-managing systems in four areas: 1) self-configuring, or the 
ability to understand the environment and configure accordingly; 2) self-healing, which 
is the ability for systems to determine problems and work around or fix them; 
3) self-optimizing, the ability to re-configure based on changing conditions; and 
4) self-protecting, which is the ability to guard against external threats. 



What Is Autonomic Computing? 

Self-managing systems that ... 


■ Increase Responsiveness 
Adapt to dynamically changing environments 

■ Business Resiliency 
Discover, diagnose, and act to prevent disruptions 

■ Operational Efficiency 
Tune resources and balance workloads to maximize use of IT resources 

■ Secure Information and Resources 
Anticipate, detect, identify, and protect against attacks 

Evolutionary Path ... 


Figure 7 
EVOLUTION; NOT REVOLUTION 


A fully autonomic system is something you evolve to from where you are today. 
This chart lays out the progression from a manual IT infrastructure to a fully autonomic 
one. 


Evolution; Not Revolution 



Level 1 - Basic 
  System: Multiple sources of system generated data 
  IT staff: Requires extensive, highly skilled IT staff 

Level 2 - Managed 
  System: Consolidation of data and actions through management tools 
  IT staff: IT staff analyzes and takes actions 
  Benefits: Greater system awareness; improved productivity 

Level 3 - Predictive 
  System: System monitors, correlates and recommends actions 
  IT staff: IT staff approves and initiates actions 
  Benefits: Reduced dependency on deep skills; faster/better decision making 

Level 4 - Adaptive 
  System: System monitors, correlates and takes action 
  IT staff: IT staff manages performance against SLAs 
  Benefits: Balanced human/system interaction; IT agility and resiliency 

Level 5 - Autonomic 
  System: Integrated components dynamically managed by business rules/policies 
  IT staff: IT staff focuses on enabling business needs 
  Benefits: Business policy drives IT management; business agility and resiliency 

Manual --> Autonomic 
AN ANALOGY - CAMERA “AUTONOMICS” 


There is an analogy here to the camera industry. Over time, cameras evolved 
from a very manual set of functions to a highly automated set. One point of note here - 
even though the automated capabilities exist in the camera industry, it is still possible to 
configure the camera manually. This will be true in autonomic computing systems as 
well. 



Figure 9 
AUTONOMIC COMPUTING REFERENCE MODEL 


This diagram is a model for what an autonomic manager looks like. It requires 
elements of a system (database, server, storage, applications) to expose a set of “sensors” 
(state information on the element) and “effectors” (interfaces for tuning, configuring, 
changing state, etc.). Given any set of sensors and effectors, an autonomic manager can 
be built which monitors the sensors, analyzes the data, compares the existing state to the 
desired state (rendered as “knowledge”) and then plans and executes a change. 
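
The control loop described above can be illustrated with a minimal Python sketch. It is not an IBM interface: the class names (ManagedElement, AutonomicManager), the method names and the example settings are all hypothetical, chosen only to show a manager that monitors sensors, compares observed state with desired state, and drives effectors to close the gap.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ManagedElement:
    """Hypothetical system element exposing sensors and effectors."""
    state: Dict[str, float]

    def read_sensors(self) -> Dict[str, float]:
        # Sensors: a snapshot of the element's current state.
        return dict(self.state)

    def apply_effector(self, name: str, value: float) -> None:
        # Effectors: an interface for changing configuration or state.
        self.state[name] = value


@dataclass
class AutonomicManager:
    """Hypothetical manager: monitor, analyze, plan, execute."""
    element: ManagedElement
    desired: Dict[str, float]            # the "knowledge": desired state
    log: List[str] = field(default_factory=list)

    def monitor(self) -> Dict[str, float]:
        return self.element.read_sensors()

    def analyze(self, observed: Dict[str, float]) -> Dict[str, float]:
        # Keep only the settings whose observed value differs from the goal.
        return {k: v for k, v in self.desired.items() if observed.get(k) != v}

    def plan(self, gaps: Dict[str, float]) -> List[Tuple[str, float]]:
        # A trivial plan: apply every correction, in a stable order.
        return sorted(gaps.items())

    def execute(self, plan: List[Tuple[str, float]]) -> None:
        for name, value in plan:
            self.element.apply_effector(name, value)
            self.log.append(f"set {name}={value}")

    def run_once(self) -> int:
        plan = self.plan(self.analyze(self.monitor()))
        self.execute(plan)
        return len(plan)


# Example values are invented for illustration.
db = ManagedElement({"buffer_pool_mb": 64.0, "max_connections": 100.0})
manager = AutonomicManager(
    db, desired={"buffer_pool_mb": 256.0, "max_connections": 100.0}
)
changes = manager.run_once()   # one setting differs, so one change is applied
```

A real autonomic manager would presumably plan against policies rather than a fixed target dictionary; the sketch only shows how sensors, effectors and knowledge fit together in one pass of the loop.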



Autonomic Computing Reference Model 




Figure 10 





MULTIPLE CONTEXTS FOR AUTONOMIC BEHAVIOR 


Autonomic managers can exist at many layers in the system. The challenge is to 
coordinate the behaviors of the AC systems. 



Multiple Contexts for Autonomic Behavior 


Autonomic elements 
monitor activities and 
take actions to adjust 
the system to meet 
business objectives 

The value of autonomic 
computing increases 
when components work 
together 

Autonomic elements 
learn from past 
experience to build 
action plans 
SELF-CONFIGURING EXAMPLE 


This is a very good example of a self-configuring system. The Configuration 
Advisor in DB2 can self-configure a database system. Often the results of the 
Configuration Advisor are as good as, if not better than, those of a human database 
administrator. 


Self-Configuring Example 



• Speeds deployment 
• Improves performance 
• Frees up resource 



DB2 Data Management Software 

[Chart: DB2 Configuration Advisor Results. DBA-tuned, Advisor and default 
configurations, shown as a % of the tuned configuration, for the OLTP-32, OLTP-64, 
Cust #1 and Cust #2 workloads.] 
SELF-PROTECTING EXAMPLE 


Tivoli’s Risk Manager is an example of a self-protecting system. By monitoring 
and correlating data from across the infrastructure, Risk Manager can determine if there 
are external threats to the system. 


Self-Protecting Example 


• Automate incident response 

• Help prevent service disruptions 

• Protect systems and data 

[Diagram: Risk Manager, correlation engine, Risk Mgr IDS rules, security event 
database and intrusion detection systems (IDS), monitoring the Internet, router, 
firewall, web server, application server and intranet.] 
AUTONOMIC COMPUTING WILL IMPACT IT PROCESSES 


Finally, it is important to note that autonomic computing will greatly help in the 
automation of processes in an IT infrastructure. 



Figure 14 
Towards Autonomic Computational Science & Engineering 
(Autonomic Computing: Application Perspective) 


Manish Parashar 

The Applied Software Systems Laboratory 
ECE/CAIP, Rutgers University 
THE CURRENT TEAM 


The AutoMate team is composed of faculty and graduate and undergraduate 
students at The Applied Software Systems Laboratory, Department of Electrical and 
Computer Engineering and Center of Advanced Information Processing (CAIP), Rutgers, 
The State University of New Jersey. The team is organized as the Autonomic Computing 
Research Group and the Autonomic Applications Research Group. This research builds 
on our collaborations with application scientists, engineers and computer and 
computational scientists at California Institute of Technology, University of Texas at 
Austin, University of Arizona, Ohio State University, and University of Maryland. 


The Current Team 


TASSL Rutgers University 

- Autonomic Computing Research Group 

• Viraj Bhat 
• Manish Agarwal 
• Hua Liu (Maria) 
• Zhen Li (Jenny) 
• Manish Mahajan 
• Vincent Matossian 
• Venkatesh Putty 
• Cristina Schmidt 
• Guangsen Zhang 

- Autonomic Applications Research Group 

• Sumir Chandra 
• Xiaolin Li 
• Taher Saif 
• Li Zhang 
• Hailan Zhu 

CS Collaborators 

- HPDC, University of Arizona 

• Salim Hariri 

- Biomedical Informatics, The Ohio State University 

• Tahsin Kurc, Joel Saltz 

- CS, University of Maryland 

• Alan Sussman, Christian Hansen 

Applications Collaborators 

- CSM, University of Texas at Austin 

• Malgorzata Peszynska, Mary Wheeler 

- IG, University of Texas at Austin 

• Mrinal Sen, Paul Stoffa 

- ASCI/CACR, Caltech 

• Michael Aivazis, Julian Cummings, Dan Meiron 

- CRL, Sandia National Laboratory, Livermore 

• Jaideep Ray, Johan Steensland 



Figure 1 




OVERVIEW OF THE TALK 


This talk motivates and introduces autonomic computational science and 
engineering, and presents the AutoMate framework for enabling autonomic applications 
on the Grid. It describes the AutoMate architecture and briefly presents each of its 
components. These include the ACCORD autonomic component framework, the 
RUDDER decentralized deductive engine, the SESAME context-sensitive dynamic 
access management framework, the Pawn peer-to-peer messaging substrate, and the 
SQUID decentralized discovery service. Finally, it describes two applications of 
autonomic computing to science and engineering: an autonomic runtime management 
framework for adaptive applications (V-Grid) and autonomic interactions for oil reservoir 
optimization. 


Outline 


• Autonomic computational science and engineering 

• AutoMate: A framework for enabling autonomic 
applications 

- ACCORD: Autonomic component framework 

- RUDDER: Decentralized deductive engine 

- SESAME: Context-sensitive dynamic access management 

- Pawn: Peer-to-Peer messaging infrastructure 

- SQUID: Decentralized discovery service 

• Application Scenarios 

- V-Grid autonomic runtime for adaptive applications 

• reactive/proactive partitioning, load-balancing, scheduling, 
performance management 

- Autonomic interactions for oil reservoir optimization 

• Conclusions and current status 



Figure 2 
COMPUTATIONAL MODELING OF PHYSICAL PHENOMENA 


Realistic, physically accurate simulations of complex physical phenomena that 
symbiotically and opportunistically combine computations, experiments, observations, 
and real-time data have the potential for providing dramatic insights into complex 
systems such as interacting black holes and neutron stars, formations of galaxies, 
subsurface flows in oil reservoirs and aquifers, and dynamic response of materials to 
detonations. However, the phenomena being modeled by these applications are inherently 
large-scale, dynamic and heterogeneous (in time, space, and state). Furthermore, the 
applications are extremely large with unprecedented resource requirements, and are 
composed of large numbers of software components with very dynamic compositions 
and interactions between these components. 


Computational Modeling of Physical Phenomena 


• Realistic, physically accurate computational modeling 

- Large computation requirements 

• e.g. simulation of the core-collapse of supernovae in 3D with reasonable 
resolution (500^3) would require ~10-20 teraflops for 1.5 months (i.e. ~100 
Million CPUs!) and about 200 terabytes of storage 

• e.g. turbulent flow simulations using active flow control in aerospace and 
biomedical engineering requires 5000 x 1000 x 500 = 2.5 x 10^9 points and 
approximately 10^7 time steps, i.e. with 1 GFlop processors requires a 
runtime of ~7 x 10^6 CPU hours, or about one month on 10,000 CPUs! (with 
perfect speedup). Also with 700 B/pt the memory requirement is ~1.75 TB 
of run time memory and ~800 TB of storage. 

- Coupled, multiphase, heterogeneous, dynamic 

• multi-physics, multi-model, multi-resolution, .... 

- Complex interactions 

• application - application, application - resource, application - data, 
application - user, . . . 

- Software/systems engineering/programmability 

• volume and complexity of code, community of developers, . . . 

- scores of models, hundreds of components, millions of lines of code, . . . 
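
The turbulent flow figures quoted on this slide can be cross-checked with a few lines of arithmetic. The short Python sketch below uses only numbers stated on the slide (a 5000 x 1000 x 500 grid, 700 bytes per point, about 7 million CPU hours, 10,000 CPUs); the variable names are illustrative, and decimal terabytes and perfect speedup are assumed.

```python
# Figures quoted on the slide for the turbulent flow example.
grid_points = 5000 * 1000 * 500   # number of grid points
bytes_per_point = 700             # 700 B/pt
cpu_hours = 7e6                   # ~7 million CPU hours
cpus = 10_000

# Run-time memory in decimal terabytes (1 TB = 1e12 bytes is an assumption).
memory_tb = grid_points * bytes_per_point / 1e12

# Wall-clock time on 10,000 CPUs, assuming perfect speedup as the slide does.
wall_clock_days = cpu_hours / cpus / 24

print(f"grid points: {grid_points:.2e}")
print(f"memory     : {memory_tb:.2f} TB")
print(f"wall clock : {wall_clock_days:.1f} days")
```

The results agree with the slide: 2.5 billion points, about 1.75 TB of run time memory, and roughly 29 days, i.e. about one month, on 10,000 CPUs.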
COMPUTATIONAL MODELING AND THE GRID 


The emergence of computational Grids and the potential for seamless 
aggregation, integration and interactions has made it possible to conceive the realistic, 
scientific and engineering simulations of complex physical phenomena described in the 
previous slide. However, the Grid infrastructure is also heterogeneous and dynamic, 
globally aggregating large numbers of independent computing and communication 
resources, data stores and sensor networks. The combination of the two (large, complex, 
heterogeneous and dynamic applications and Grids) results in application development, 
configuration and management complexities that break current paradigms based on 
passive components and static compositions. Clearly, there is a need for a fundamental 
change in how these applications are formulated, composed and managed so that their 
heterogeneity and dynamics can match and exploit the heterogeneous and dynamic nature 
of the Grid. In fact, we have reached a level of complexity, heterogeneity, and dynamism 
for which our programming environments and infrastructure are becoming unmanageable, 
brittle and insecure. This has led researchers to consider alternative programming 
paradigms and management techniques that are based on strategies used by biological 
systems to deal with complexity, heterogeneity and uncertainty. The approach is referred 
to as autonomic computing. An autonomic computing system is one that has the 
capabilities of being self-defining, self-healing, self-configuring, self-optimizing, 
self-protecting, context aware, and open. 


Computational Modeling and the Grid 


• The Computational Grid 

- Potential for aggregating resources 

• computational requirements 

- Potential for seamless interactions 

• new applications formulations 

• Developing applications to utilize and exploit the Grid remains a significant 
challenge 

- The problem: a level of complexity, heterogeneity, and dynamism for which our 
programming environments and infrastructure are becoming unmanageable, brittle and 
insecure 

• System size, heterogeneity, dynamics, reliability, availability, usability 

• Currently typically proof-of-concept demos by “hero programmers” 

- Requires fundamental changes in how applications are formulated, composed and 
managed 

• Breaks current paradigms based on passive components and static compositions 

• autonomic components and their dynamic composition, opportunistic interactions, virtual 
runtime, ... 

- Resonance - heterogeneity and dynamics must match and exploit the heterogeneous 
and dynamic nature of the Grid 

• Autonomic, adaptive, interactive simulations and the Grid offer the potential for 
such simulations 

- Autonomic: context aware, self configuring, self adapting, self optimizing, self healing,... 

- Adaptive: resolution, algorithms, execution, scheduling, . . . 

- Interactive: peer interactions between computational objects and users, data, 
resources, ... 
AUTOMATE 


The overall objective of the AutoMate project is to investigate key technologies to 
enable the development of autonomic Grid applications that are context aware and are 
capable of self-configuring, self-composing, self-optimizing and self-adapting. 
Specifically, it will investigate the definition of autonomic components, the development 
of autonomic applications as dynamic composition of autonomic components, and the 
design of key enhancements to existing Grid middleware and runtime services to support 
these applications. 

Definition of Autonomic Components: The definition of programming abstractions and 
supporting infrastructure that will enable the definition of autonomic components. In 
addition to the interfaces exported by traditional components, autonomic components 
provide enhanced profiles or contracts that encapsulate their functional, operational, and 
control aspects. These aspects export information and policies about their behavior, 
resource requirements, performance, interactivity and adaptability to system and 
application dynamics. Furthermore, they encapsulate sensors, actuators, access policies 
and a policy-engine. Together, aspects, policies, and policy engine allow autonomic 
components to consistently configure, manage, adapt and optimize their execution. 

Dynamic Composition of Autonomic Applications: The development of mechanisms 
and supporting infrastructure to enable autonomic applications to be dynamically and 
opportunistically composed from autonomic components. The composition will be based 
on policies and constraints that are defined, deployed and executed at run time, and will 
be aware of available Grid resources (systems, services, storage, data) and components, 
and their current states, requirements, and capabilities. 

Autonomic Middleware Services: The design, development, and deployment of key 
services on top of the Grid middleware infrastructure to support autonomic applications. 
One of the key requirements for autonomic behavior and dynamic compositions is the 
ability of the components, applications and resources (systems, services, storage, data) to 
interact as peers. Furthermore the components should be able to sense their environment. 
In this project, we extend the Grid middleware with (1) a peer-to-peer messaging 
substrate, (2) context aware services, and (3) peer-to-peer deductive engines for 
composition, configuration and management of autonomic applications. An active 
peer-to-peer control network will combine sensors, actuators and rules to configure and tune 
components and their execution environment at runtime and to satisfy requirements and 
performance and quality of service constraints. 
AutoMate: Enabling Autonomic Applications 


• Objective: 

- Investigate key technologies to enable the development of autonomic Grid applications 
that are context aware and are capable of self-configuring, self-composing, 
self-optimizing and self-adapting. 

• Overview: 

- Definition of Autonomic Components: 

• definition of programming abstractions and supporting infrastructure that will enable the 
definition of autonomic components 

• autonomic components provide enhanced profiles or contracts that encapsulate their 
functional, operational, and control aspects 

- Dynamic Composition of Autonomic Applications: 

• mechanisms and supporting infrastructure to enable autonomic applications to be dynamically 
and opportunistically composed from autonomic components 

• compositions will be based on policies and constraints that are defined, deployed and executed 
at run time, and will be aware of available Grid resources (systems, services, storage, data) 
and components, and their current states, requirements, and capabilities 

- Autonomic Middleware Services: 

• design, development, and deployment of key services on top of the Grid middleware 
infrastructure to support autonomic applications 

• a key requirement for autonomic behavior and dynamic compositions is the ability of the 
components, applications and resources (systems, services, storage, data) to interact as peers 
AUTOMATE ARCHITECTURE 


AutoMate builds on the emerging Grid infrastructure and extends the Open Grid 
Service Architecture (OGSA). AutoMate is composed of the following components: 

AutoMate System Layer: The AutoMate system layer builds on the Grid middleware 
and OGSA and extends core Grid services (security, information and resource 
management, data management) to support autonomic behavior. Furthermore, this layer 
provides specialized services such as peer-to-peer semantic messaging, events and 
notification. 

AutoMate Component Layer: The AutoMate component layer addresses the definition, 
execution and runtime management of autonomic components. It consists of AutoMate 
components that are capable of self configuration, adaptation and optimization, and 
supporting services such as discovery, factory, lifecycle, context, etc. (which build on 
core OGSA services). 

AutoMate Application Layer: The AutoMate application layer builds on the component 
and system layers to support the autonomic composition and dynamic (opportunistic) 
interactions between components. 

AutoMate Engines: AutoMate engines are decentralized (peer-to-peer) networks of 
agents in the system. The context-awareness engine is composed of context agents and 
services and provides context information at different levels to trigger autonomic 
behaviors. The deductive engine is composed of rule agents which are part of the 
applications, components, services and resources, and provides the collective decision 
making capability to enable autonomic behavior. Finally, the trust and access control 
engine is composed of access control agents and provides dynamic context-aware control 
to all interactions in the system. 

In addition to these layers, AutoMate portals provide users with secure, pervasive 
(and collaborative) access to the different entities. Using these portals users can access 
resources; monitor, interact with, and steer components; compose and deploy applications; 
and configure and deploy rules. AutoMate leverages the experiences and technologies 
developed as part of the Discover/DIOS computational collaboratory project 
(http://www.discoverportal.org). The different components are described in the following 
sections. 
AUTOMATE ARCHITECTURE 


Key components of AutoMate include: 

• ACCORD (Autonomic Components, Compositions and Coordination) 
component framework that enables the definition of autonomic 
components, their autonomic compositions and opportunistic interactions. 

• RUDDER (Rule Definition Deployment and Execution Service) 
decentralized deductive engine. 

• SESAME (Scalable Environment Sensitive Access Management Engine) 
dynamic access control engine. 

• Pawn decentralized (P2P) messaging substrate. 

• SQUID flexible information discovery service. 

These components are introduced in the following slides. 


AutoMate: Components 


• ACCORD: Autonomic application framework 

• RUDDER: Decentralized deductive engine 

• SESAME: Dynamic access control engine 

• Pawn: P2P messaging substrate 

• SQUID: P2P discovery service 



Figure 7 
ACCORD: AUTONOMIC COMPONENTS 


Autonomic components in AutoMate export information and policies about their 
behavior, resource requirements, performance, interactivity, and adaptability to system 
and application dynamics. In addition to the functional interfaces exported by traditional 
components, AutoMate components provide semantically enhanced profiles or contracts 
that encapsulate their functional, operational, and control aspects. A conceptual overview 
of an AutoMate component is presented in the figure. The functional aspect specification 
abstracts component functionality, such as order of interpolation (linear, quadratic, etc.). 
This functional profile is then used by the compositional engine to select appropriate 
components based on application requirements. The operational aspect specification 
abstracts a component's operational behavior, including computational complexity, 
resource requirements, and performance (scalability). This profile is then used by the 
configuration and runtime engines to optimize component selection, mapping and 
adaptation. Finally, the control aspect describes the adaptability of the component and 
defines sensors/actuators and policies for management, interaction and control. 

AutoMate components also encapsulate access policies, rules, a rule agent, and an 
access agent that allow the components to consistently and securely configure, manage, 
adapt and optimize their execution based on rules and access policies. The access agent is 
a part of the AutoMate access control engine and the underlying dynamic access control 
model, and manages access to the component based on its current context and state. The 
rule agent is part of RUDDER, the AutoMate deductive engine, and manages local rule 
definition, evaluation and execution at the component level. Rules can be dynamically 
defined (and changed) in terms of the component's interfaces (based on access policies) 
and system and environmental parameters. Execution of rules can change the state, 
context and behavior of a component, and can generate events to trigger other rule agents. 
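
The structure described above can be made concrete with a small Python sketch. It is not the ACCORD API: the class names (Rule, AutonomicComponent), the field names and the example values are hypothetical, and access policies and rule conflict resolution are left out. The sketch only shows a component that carries functional, operational and control aspects, plus a local rule agent that evaluates rules against sensor readings and drives actuators.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical rule: if the condition holds, set an actuator."""
    name: str
    condition: Callable[[Dict[str, float]], bool]  # evaluated on sensor readings
    actuator: str                                  # actuator to drive
    value: object                                  # setting to apply


@dataclass
class AutonomicComponent:
    """Hypothetical component profile plus a local rule agent."""
    functional: Dict[str, object]    # functional aspect, e.g. interpolation order
    operational: Dict[str, object]   # operational aspect, e.g. complexity, memory
    sensors: Dict[str, float]        # control aspect: observable state
    actuators: Dict[str, object]     # control aspect: tunable settings
    rules: List[Rule] = field(default_factory=list)

    def run_rule_agent(self) -> List[str]:
        """Evaluate local rules and return the names of those that fired.

        The returned names stand in for the events that, in the text,
        may trigger other rule agents.
        """
        fired = []
        for rule in self.rules:
            if rule.actuator in self.actuators and rule.condition(self.sensors):
                self.actuators[rule.actuator] = rule.value
                fired.append(rule.name)
        return fired


# Example values are invented for illustration.
interp = AutonomicComponent(
    functional={"interpolation_order": "quadratic"},
    operational={"complexity": "O(n)", "memory_mb": 512},
    sensors={"load": 0.95},
    actuators={"interpolation_order": "quadratic"},
    rules=[
        Rule("degrade_on_overload", lambda s: s["load"] > 0.9,
             "interpolation_order", "linear"),
    ],
)
events = interp.run_rule_agent()
```

Here a rule defined in terms of the component's own sensors and actuators changes its behavior at run time (quadratic to linear interpolation under high load), which is the kind of adaptation the control aspect is meant to expose.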


ACCORD: Autonomic Components 





• Autonomic components export 
information and policies about their 
behavior, resource requirements, 
performance, interactivity and 
adaptability to system and 
application dynamics 

- functional aspects 

• abstracts component functionality, 
such as order of interpolation (linear, 
quadratic, etc.) 

• used by the compositional engine to 
select appropriate components based 
on application requirements 

- operational aspects 

• abstracts a component's operational 
behavior, including computational 
complexity, resource requirements, 
and performance (scalability) 

• used by the configuration and runtime 
engines to optimize component 
selection, mapping and adaptation 

- control aspect 

• describes the adaptability of the 
component and defines 
sensors/actuators and policies for 
management, interaction and control. 


• AutoMate components encapsulate 
access policies, rules, a rule agent, and 
an access agent 

- enables components to consistently and 
securely configure, manage, adapt and 
optimize their execution based on rules 
and access policies. 

- rules/policies can be dynamically defined 
(and changed) in terms of the 
component's interfaces (based on 
access policies) and system and 
environmental parameters 

- rule execution may change the state, 
context and behavior of a component, 
and can generate events to trigger other 
rule agents 

- rule agent manages rule execution and 
resolves rule conflicts 


Figure 8 




ACCORD: AUTONOMIC COMPOSITIONS 


Applications are typically composed with well defined objectives. In the case of 
autonomic applications, however, these objectives can dynamically change based on the 
state of the application and/or the system. As a result, we need to dynamically select 
components and compose them at runtime based on current objectives. Together, the 
profiles, policies, and rules allow autonomous components to consistently and securely 
manage and optimize their executions. Furthermore, they enable applications to be 
dynamically composed, configured and adapted. Dynamic application workflows can be 
defined to select the most appropriate components based on user/application constraints 
(highest performance, lowest cost, reservation, execution time upper bound, best 
accuracy) and on the current application requirements; to dynamically configure the 
components' algorithms and behavior based on available resources or on system and/or 
application state; and to adapt this behavior if necessary. 

The AutoMate dynamic composition model may be viewed as transforming a 
given composition or workflow into a new one by adding or modifying interactions and 
participating entities. Its primary goal is to enable dynamic (and opportunistic) 
choreography and interactions of components and services to react to the heterogeneity 
and dynamics of the application and underlying execution environment to produce the 
desired user objectives. 
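The component selection step described above can be illustrated with a minimal Python sketch. The profile fields (`cost`, `est_time`, `accuracy`), the objective names and the `select_component` helper are hypothetical and are not taken from AutoMate; the sketch only shows how a user constraint (an execution time upper bound) and an objective could jointly drive the choice of a component at runtime.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ComponentProfile:
    """Hypothetical component profile; the field names are assumptions."""
    name: str
    cost: float       # arbitrary cost units
    est_time: float   # estimated execution time in seconds
    accuracy: float   # 0..1, higher is better


def select_component(candidates: List[ComponentProfile],
                     max_time: float,
                     objective: str) -> Optional[ComponentProfile]:
    """Keep candidates within the time bound, then pick the best for the objective."""
    feasible = [c for c in candidates if c.est_time <= max_time]
    if not feasible:
        return None  # the constraint cannot be satisfied right now
    if objective == "lowest_cost":
        return min(feasible, key=lambda c: c.cost)
    if objective == "best_accuracy":
        return max(feasible, key=lambda c: c.accuracy)
    if objective == "highest_performance":
        return min(feasible, key=lambda c: c.est_time)
    raise ValueError(f"unknown objective: {objective}")


solvers = [
    ComponentProfile("solver_fast", cost=9.0, est_time=10.0, accuracy=0.90),
    ComponentProfile("solver_cheap", cost=2.0, est_time=40.0, accuracy=0.85),
    ComponentProfile("solver_precise", cost=6.0, est_time=90.0, accuracy=0.99),
]
```

Re-running the selection whenever the objective or the time bound changes is one simple way to picture a workflow being transformed into a new one at runtime.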



Peer-to-Peer, Context Aware Middleware 



Figure 9 


ACCORD: OPPORTUNISTIC INTERACTIONS 


Opportunistic interactions are decentralized and based on the satisfaction of 
locally defined goals and constraints. These interactions are inherently dynamic and 
ad-hoc, and use semantic publisher/subscriber messaging based on proximity, privileges, 
capabilities, context, interests, and offerings. The goals/constraints are typically 
long-term and may or may not be satisfied. The interactions do not involve explicit 
synchronization; the interaction semantics are achieved through feedback and 
consensus-building mechanisms. 
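As a rough illustration of semantic publisher/subscriber messaging, the following Python sketch matches a published profile against locally defined interests. The `SemanticBus` class, the attribute names (`capability`, `distance_m`) and the predicate-based interest format are assumptions made for this example; it runs in a single process and does not model the decentralized substrate, feedback or consensus building.

```python
from typing import Any, Callable, Dict, List, Tuple

Profile = Dict[str, Any]
Interest = Dict[str, Callable[[Any], bool]]


def matches(profile: Profile, interest: Interest) -> bool:
    """True when every attribute named in the interest is present and accepted."""
    return all(key in profile and accept(profile[key])
               for key, accept in interest.items())


class SemanticBus:
    """Toy in-process stand-in for a semantic messaging substrate (assumption)."""

    def __init__(self) -> None:
        self._subs: List[Tuple[Interest, Callable[[Profile, Any], None]]] = []

    def subscribe(self, interest: Interest,
                  callback: Callable[[Profile, Any], None]) -> None:
        self._subs.append((interest, callback))

    def publish(self, profile: Profile, payload: Any) -> int:
        """Deliver to every matching subscriber; return how many matched."""
        delivered = 0
        for interest, callback in self._subs:
            if matches(profile, interest):
                callback(profile, payload)
                delivered += 1
        return delivered


bus = SemanticBus()
received: List[Any] = []
bus.subscribe(
    {"capability": lambda c: c == "visualization",
     "distance_m": lambda d: d < 50},
    lambda profile, payload: received.append(payload),
)
count_near = bus.publish({"capability": "visualization", "distance_m": 10}, "frame-1")
count_far = bus.publish({"capability": "visualization", "distance_m": 500}, "frame-2")
```

A publication that matches no interest is simply not delivered, which mirrors the point that constraints are satisfied opportunistically and may not be satisfied at all.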


ACCORD: Opportunistic Interactions 



• Interactions based on local goals and objectives 

- local goals and objectives are defined as constraints to be satisfied 

- constraints can be updated and new constraints can be defined at any time 

• Dynamic and ad-hoc 

- interactions use “semantic messaging” based on proximity, privileges, 
capabilities, context, interests, offerings, etc. 

• Opportunistic 

- constraints are long-term and satisfied opportunistically (may not be satisfied) 

• Probabilistic guarantees and soft state 

- no explicit synchronization 

- interaction semantics are achieved using feedback and consensus building 



Figure 10 




RUDDER: DEDUCTIVE ENGINE 


RUDDER provides the core capabilities for supporting autonomic compositions, 
adaptations, and optimizations. It is a decentralized deductive engine composed of 
distributed specialized agents (component rule agents, composition agents, context agents 
and system agents) that exist at different levels of the system, and it represents their 
collective behavior. It provides mechanisms for dynamically defining, configuring, 
modifying and deleting rules. Furthermore, it defines an XML schema for composing 
rules and provides mechanisms for deploying and routing rules, decomposing and 
distributing them to relevant agents, and coordinating the execution of rules. It also 
manages conflict resolution within a single entity and across entities. 

The figure presents a schematic overview of RUDDER. It builds on AutoMate 
and Grid services and the underlying semantic messaging infrastructure. Rules can be 
dynamically injected into the system and are routed by the messaging substrate to the 
appropriate agents. Furthermore, the agents may hierarchically decompose a rule and 
distribute it to peer agents. For example, an application level rule may be decomposed 
into sub-rules that are assigned to its components. The component rules may be further 
decomposed into rules for the underlying system entities. 
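The hierarchical decomposition described above can be sketched in a few lines of Python. The `RuleAgent` class, the `memory_mb` attribute and the even split of a limit among child agents are assumptions chosen for illustration; RUDDER's actual rule schema and decomposition strategy are not given in the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RuleAgent:
    """Hypothetical agent that stores rules and forwards sub-rules to its children."""
    name: str
    children: List["RuleAgent"] = field(default_factory=list)
    rules: List[Tuple[str, float]] = field(default_factory=list)

    def inject(self, attribute: str, limit: float) -> None:
        """Record the rule locally, then split the limit evenly among children."""
        self.rules.append((attribute, limit))
        if self.children:
            share = limit / len(self.children)  # even split is an assumption
            for child in self.children:
                child.inject(attribute, share)


def build_example() -> RuleAgent:
    """Application -> two components -> two system entities each."""
    comp_a = RuleAgent("component-a",
                       children=[RuleAgent("node-a1"), RuleAgent("node-a2")])
    comp_b = RuleAgent("component-b",
                       children=[RuleAgent("node-b1"), RuleAgent("node-b2")])
    return RuleAgent("application", children=[comp_a, comp_b])


app = build_example()
# An application-level rule "memory_mb <= 4096" becomes per-component and
# per-entity sub-rules as it travels down the hierarchy.
app.inject("memory_mb", 4096.0)
```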


RUDDER: The AutoMate Deductive Engine 



• RUDDER is a decentralized deductive engine composed of distributed 
specialized agents (component rule agents, composition agents, context 
agents and system agents) that exist at different levels of the system, and 
represents their collective behavior. 

- provides mechanisms for dynamically defining, configuring, modifying and deleting 
rules/policies/constraints 

- defines an XML schema for composing rules and provides mechanisms for 
deploying and routing rules, decomposing and distributing them to relevant agents, 
and for coordinating the execution of rules 

- manages conflict resolution within a single entity and across entities 

- provides the core capabilities for supporting autonomic compositions, adaptations, 
and optimizations 


Figure 11 


SESAME: CONTEXT SENSITIVE ACCESS MANAGEMENT 


A key requirement of autonomic applications is support for dynamic, seamless 
and secure interactions between the participating entities, i.e. components, services, 
applications, data, instruments, resources and users. Ensuring interaction security requires 
a fine-grained access control mechanism. Furthermore, in the highly dynamic and 
heterogeneous Grid environment, the access rights of an entity depend on the entity's 
privileges, capabilities, context and state. For example, the ability of a user to access a 
resource or steer a component depends on the user's privileges (e.g. owner), current 
capabilities (e.g. resources available), current context (e.g. secure connection) and the 
state of the resource or component. The AutoMate Access Control Engine addresses 
these issues and provides dynamic access control to users, applications, services, 
components and resources. The engine is composed of access control agents associated 
with various entities in the system. The underlying dynamic role-based access control 
mechanism extends the RBAC (Role Based Access Control) model to make access 
control decisions based on dynamic context information. The access control engine 
dynamically adjusts Role Assignments and Permission Assignments. 
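A minimal Python sketch of context-sensitive role-based access control follows. The role names, the `secure_connection` context flag, the `locked` resource state and all function names are hypothetical; the sketch only illustrates the idea that both the role assignment and the permission assignment can be adjusted from context before an access decision is made.

```python
from typing import Dict, Set

# Static role -> permission table (illustrative values only).
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "owner": {"read", "steer"},
    "collaborator": {"read"},
}


def effective_role(base_role: str, context: Dict[str, bool]) -> str:
    """Dynamic role assignment: an owner on an insecure link is downgraded."""
    if base_role == "owner" and not context.get("secure_connection", False):
        return "collaborator"
    return base_role


def effective_permissions(role: str, resource_state: str) -> Set[str]:
    """Dynamic permission assignment: a locked resource cannot be steered."""
    permissions = set(ROLE_PERMISSIONS.get(role, set()))
    if resource_state == "locked":
        permissions.discard("steer")
    return permissions


def is_allowed(base_role: str, action: str,
               context: Dict[str, bool], resource_state: str) -> bool:
    """Combine both adjustments into a single access decision."""
    role = effective_role(base_role, context)
    return action in effective_permissions(role, resource_state)
```

For instance, `is_allowed("owner", "steer", {"secure_connection": False}, "running")` is denied in this sketch, because the insecure context downgrades the role before permissions are looked up.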


SESAME: Context Sensitive Access Management 


• Objective: 

- support dynamic, seamless and secure interactions between the 
participating entities (i.e. components, services, applications, data, 
instruments, resources and users) 

• Issues: 

- access rights in highly dynamic and heterogeneous Grid environments 
depend on the entity's privileges, capabilities, context and state 

• e.g. the ability of a user to access a resource or steer a component depends on 
the user's privileges (e.g. owner), current capabilities (e.g. resources available), 
current context (e.g. secure connection) and the state of the resource or 
component 


• Approach 

- extend Role Based Access Control (RBAC) to 
make access control decisions based on 
dynamic context information 

- dynamically adjust Role Assignments and 
Permission Assignments based on context 



[Diagram labels: Role, Permission; Subject component] 


Figure 12 


PAWN: P2P MESSAGING 


Pawn is a peer-to-peer messaging substrate that builds on Project JXTA to support 
peer-to-peer interactions on the Grid. Pawn provides stateful and guaranteed messaging 
to enable key application-level interactions such as synchronous/asynchronous 
communication, dynamic data injection, and remote procedure calls. It exports these 
interaction modalities through services at every step of the scientific investigation 
process, from application deployment, to interactive monitoring and steering, and group 
collaboration. 

A conceptual overview of the Pawn P2P substrate is presented in the figure. Pawn 
is composed of peers (computing, storage, or user peers), network and interaction 
services, and mechanisms. These components are layered to represent the requirements 
stack enabling interactions in a Grid environment. The figure can be read from bottom to 
top as: “Peers compose messages handled by services through specific interaction 
modalities”. 


Pawn: A P2P Messaging Substrate 


• Objective 

- Engineer a peer-to-peer messaging 
substrate that extends existing solutions to 
enable high-level interactions for scientific 
applications. 

• Architecture 

- Peers, Messages, Services, Interactions 

• Key Features 

- Stateful messages 

- Guaranteed messaging semantics 

- Publish/subscribe mechanisms across 
peer-to-peer domains 

- High-level messaging semantics 

• Sync/Async Messaging 

• PUSH (dynamic injection) 

• Pawn RPC 


• Built on Project JXTA 

- Pipes 

- Resolver 


[Figure layers, top to bottom] 

Interactions: Synchronous/Asynchronous; Dynamic Data Injection; Remote Procedure Calls 

Services: Application Execution; Application Runtime Control; Application Monitoring and 
Steering; Collaboration 

Messages: Platform-independent; Coordination; Guarantees 

Peers: Client; Rendezvous; Application 


Figure 13 


SQUID: DECENTRALIZED DISCOVERY 


A fundamental problem in large, decentralized, distributed resource sharing 
environments such as the Grid is the efficient discovery of information in the absence of 
global knowledge of naming conventions. For example, a document is better described by 
keywords than by its filename, a computer by a set of attributes such as CPU type, 
memory, and operating system type than by its host name, and a component by its aspects 
than by its instance name. The heterogeneous nature and large volume of data and 
resources, their dynamism (e.g. CPU load) and the dynamism of the Grid make 
information discovery a challenging problem. An ideal information discovery system has 
to be efficient, fault-tolerant and self-organizing, has to offer guarantees, and has to 
support flexible searches (using keywords, wildcards, range queries). Decentralized 
peer-to-peer (P2P) systems, by their inherent properties (self-organization, 
fault-tolerance, scalability), provide an attractive solution. 

SQUID supports decentralized information discovery in AutoMate. It is a P2P 
system that supports complex queries containing partial keywords, wildcards, and range 
queries, and guarantees that all existing data elements that match a query will be found 
with bounded costs in terms of number of messages and number of nodes involved. The 
key innovation is a dimension reducing indexing scheme that effectively maps the 
multidimensional information space to physical peers. 


SQUID: A Decentralized Discovery Service 


• Overview/Motivation: 


- Efficient information discovery in the absence of global knowledge of naming 
conventions is a fundamental problem in large, decentralized, distributed 
resource sharing environments such as the Grid 

• a document is better described by keywords than by its filename, a computer by a 
set of attributes such as CPU type, memory, operating system type than by its host 
name, and a component by its aspects than by its instance name. 

- Heterogeneous nature and large volume of data and resources, their 
dynamism (e.g. CPU load) and the dynamism of the Grid make the 
information discovery a challenging problem. 

• Key features 

- P2P system that supports complex queries containing partial keywords, 
wildcards, and range queries 

- Guarantees that all existing data elements that match a query will be found 
with bounded costs in terms of number of messages and number of nodes 
involved. 



- The system can be used as a complement for current resource discovery 
mechanisms in Computational Grids (to enhance them with range queries) 


Figure 14 


SQUID OPERATION 


The overall architecture of SQUID is a distributed hash table (DHT), similar to 
typical data lookup systems. The key difference is in the way we map data elements to 
the index space. In existing systems, this is done using consistent hashing to uniformly 
map data element identifiers to indices. As a result, data elements are randomly 
distributed across peers without any notion of locality. Our approach attempts to preserve 
locality while mapping the data elements to the index space. In our system, all data 
elements are described using a sequence of keywords (common words in the case of P2P 
storage systems, or values of globally defined attributes - such as memory and CPU 
frequency - for resource discovery in computational grids). These keywords form a 
multidimensional keyword space where the keywords are the coordinates and the data 
elements are points in the space. Two data elements are “local” if their keywords are 
lexicographically close or they have common keywords. Thus, we map documents that 
are local in this multi-dimensional index space to indices that are local in the 
1-dimensional index space, which are then mapped to the same node or to nodes that are 
close together in the overlay network. This mapping is derived from a family of 
locality-preserving mappings called Space Filling Curves (SFC). 

In the current implementation, we use the Hilbert SFC for the mapping, and 
Chord for the overlay network topology. The overall operation of SQUID is presented in 
the figure. (a) shows a 2-dimensional keyword space. The data element “Document” is 
described by the keywords “Computer” and “Network”. (b) shows the mapping of the 
2-dimensional space to a curve. The query (011, *) defines clusters (segments) on the 
curve. (c) shows the recursive refinement of the query (011, *) viewed as a tree. Each 
node is a cluster, and the bold characters are the cluster's prefixes. (d) illustrates the query 
resolution process by embedding the leftmost tree path (solid arrows) and the rightmost 
path (dashed arrows) onto the overlay network topology. 
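To make the locality-preserving mapping concrete, the sketch below computes the Hilbert curve index of a cell in an n x n grid (n a power of two) using the standard iterative bit-manipulation algorithm, and then assigns the index to a node with a Chord-style successor rule. The function names, the use of small integer coordinates in place of encoded keywords, and the example node identifiers are assumptions; SQUID's keyword encoding and query routing are not reproduced here.

```python
import bisect
from typing import List


def hilbert_index(n: int, x: int, y: int) -> int:
    """Map cell (x, y) of an n x n grid (n a power of two) to its Hilbert index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def successor_node(index: int, node_ids: List[int]) -> int:
    """Chord-style placement: first node id >= index, wrapping around the ring.

    node_ids must be sorted in ascending order.
    """
    i = bisect.bisect_left(node_ids, index)
    return node_ids[i % len(node_ids)]


def place(n: int, x: int, y: int, node_ids: List[int]) -> int:
    """Coordinates -> 1-dimensional index -> responsible node."""
    return successor_node(hilbert_index(n, x, y), node_ids)
```

Because consecutive Hilbert indices always belong to neighboring cells, data elements that are close in the 2-dimensional space tend to land on the same node or on nearby nodes, which is the property the text relies on for resolving wildcard and range queries over a small set of curve segments.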


V-GRID AUTONOMIC APPLICATION MANAGEMENT 


Truly realistic scientific and engineering simulations require enormous amounts 
of resources that can surpass even the aggregated capacity of the Grid. The V-Grid 
(virtual Grid) infrastructure is an application of autonomic computing to science and 
engineering that is based on the concept of virtualizing grid resources and application 
execution (analogous to virtual memory). The V-Grid autonomic runtime management 
framework allows the implementation of a simulation to be driven by the requirements of 
the science being modeled rather than the size and configuration of the machine that it 
will be run on. 

The autonomic behavior in the V-Grid has three primary aspects: (1) V-Grid 
Monitoring, (2) V-Grid Deduction, and (3) V-Grid Execution. 

The V-Grid monitoring engine is a decentralized entity composed of context 
agents that provides application and system context awareness. Application monitoring 
uses sensors exported by the autonomic components and services and provides 
information about the current state, dynamics and requirements of components and the 
application. System/resource monitoring builds on context information provided by 
OGSA and existing Grid middleware (e.g. NWS, Globus, Autopilot) and extends their 
capabilities to support dynamic monitoring requirements and information aggregation. 

The V-Grid deduction engine uses application/components specifications, context 
and predicted behavior to deduce objective functions and execution and management 
strategies. This includes identifying and characterizing natural regions, defining Virtual 
Computational Units or VCUs that reflect the current state of the application, mapping 
them onto Virtual Resource Units or VRUs based on their specifications, and outlining 
scheduling policies and constraints. This mapping of VCUs onto VRUs exploits the 
spatial, temporal and functional heterogeneity of the application to reduce couplings and 
maximize performance. 

The V-Grid execution engine implements policies and strategies defined by the 
deduction engine using OGSA and autonomic Grid services. The main activities of this 
engine are (1) dynamic reservation and allocation of VRUs, (2) adaptive mapping and 
scheduling of VCUs to VRUs, and VRUs to physical resources, and (3) autonomic 
management, control and adaptation of application execution. 


SQUID: Operation 




Figure 15 


V-Grid: Autonomic Application Management 


Figure 16 


ADAPTIVE MESH REFINEMENT 


Dynamically adaptive mesh refinement (AMR) methods for the numerical 
solution to partial differential equations (PDEs) employ locally optimal approximations, 
and can yield highly advantageous ratios for cost/accuracy when compared to methods 
based upon static uniform approximations. These techniques seek to improve the 
accuracy of the solution by dynamically refining the computational grid in regions with 
large local solution error. 

Structured AMR (SAMR) techniques start with a coarse base grid with minimum 
acceptable resolution that covers the entire computational domain. As the solution 
progresses, regions in the domain with large solution error, requiring additional 
resolution, are identified and refined. Refinement proceeds recursively so that the refined 
regions requiring higher resolution are similarly tagged and even finer grids are overlaid 
on these regions. The resulting grid structure is a dynamic adaptive grid hierarchy (such 
as the SAMR formulation by Berger and Oliger, shown in the figure). 

Methods based on SAMR can lead to computationally efficient implementations 
as they require uniform operations on regular arrays and exhibit structured 
communication patterns. Distributed implementations of these methods, however, lead to 
interesting challenges in dynamic resource allocation, data-distribution, load-balancing, 
and runtime management. 
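The tag-and-refine cycle can be sketched for a 1-dimensional grid as follows. The error values, the threshold, and the helper names are illustrative assumptions; production SAMR codes work on multi-dimensional patches and use more elaborate clustering, which this sketch does not attempt.

```python
from typing import List, Tuple

Patch = Tuple[int, int]  # half-open cell range [start, end) on one level


def tag_cells(errors: List[float], threshold: float) -> List[bool]:
    """Tag cells whose local error estimate exceeds the threshold."""
    return [e > threshold for e in errors]


def cluster_patches(tags: List[bool]) -> List[Patch]:
    """Group contiguous tagged cells into patches."""
    patches: List[Patch] = []
    start = None
    for i, tagged in enumerate(tags):
        if tagged and start is None:
            start = i
        elif not tagged and start is not None:
            patches.append((start, i))
            start = None
    if start is not None:
        patches.append((start, len(tags)))
    return patches


def refine(patches: List[Patch], factor: int = 2) -> List[Patch]:
    """Express each coarse patch in the index space of the next finer level."""
    return [(lo * factor, hi * factor) for lo, hi in patches]


coarse_errors = [0.01, 0.2, 0.3, 0.02, 0.01, 0.5, 0.4, 0.6]
coarse_patches = cluster_patches(tag_cells(coarse_errors, threshold=0.1))
fine_patches = refine(coarse_patches, factor=2)
```

Applying the same three steps to the error estimates computed on each fine patch yields the next level, and repeating this produces the dynamic grid hierarchy described in the text.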


Adaptive Mesh Refinement 

•Start with a base coarse grid with 
minimum acceptable resolution 

•Tag regions in the domain requiring 
additional resolution and overlay finer 
grids on the tagged regions of the 
coarse grid 

• Proceed recursively so that regions 
on the finer grid requiring more 
resolution are similarly tagged and 
even finer grids are overlaid on these 
regions 

• Resulting grid structure is a dynamic 
adaptive grid hierarchy 


Figure 17 


STRUCTURED ADAPTIVE MESH REFINEMENT APPLICATIONS 


Structured adaptive mesh refinement (SAMR) methods are being effectively used 
for adaptive PDE solutions in many domains, including computational fluid dynamics, 
numerical relativity, astrophysics, and subsurface modeling and oil reservoir simulation. 

The top-left application uses the Zeus kernel coupled with the GrACE (SAMR 
infrastructure) and Cactus (problem solving environment) packages, and shows a 3-D 
blast wave in the presence of a uniform magnetic field with 3 levels of refinement. 
Zeus-MP solves the equations of ideal (non-resistive), non-relativistic hydrodynamics and 
magnetohydrodynamics, including externally applied gravitational fields and self-gravity. 

The top-right figure is taken from the IPARS oil reservoir simulator and shows 
the multi-block grid structure and oil concentration contours. The MACE (Multi-block 
Adaptive Computational Engine) infrastructure supports multi-block grids where multiple 
distributed and adaptive grid blocks with heterogeneous discretization are coupled 
together with lower dimensional mortar grids. 

The CCA (Common Component Architecture) and GrACE application at bottom-left 
investigates the direct numerical simulation of flames with detailed chemistry, solving 
the Navier-Stokes and species evolution equations without approximations. The figure 
shows this simulation for a mixture of H2 and air in stoichiometric proportions, with 3 
hot spots at 1000 K causing the H2-air mixture to ignite and create many different radicals. 
The scientific problems being studied are the flame stabilization mechanisms of unsteady 
laminar and turbulent flames, with emphasis on the flame structure at the flame base. 

The bottom-right application simulates the dynamic response of materials, with 
the goal to develop a Virtual shock physics Test Facility (VTF) for a wide range of 
compressive, tensional, and shear loadings, including those produced by detonation of 
energetic materials. GrACE is the computational engine underlying the VTF. The figure 
shows the compressible turbulence simulation solving the Richtmyer-Meshkov instability 
in 3D (RM3D) using adaptive refinements. The Richtmyer-Meshkov instability is a 
fingering instability that occurs at a material interface accelerated by a shock wave. 


A Selection of SAMR Applications Enabled 





Multi-block grid structure and oil concentration contours 
(IPARS, M. Peszynska, UT Austin) 




Mixture of H2 and Air in stoichiometric 
proportions with a non-uniform temperature field 
(GrACE + CCA, Jaideep Ray, SNL, Livermore) 



Richtmyer-Meshkov - detonation in a deforming 
tube - 3 levels. Z=0 plane visualized on the right 
(VTF + GrACE, R. Samtaney, CIT) 


Figure 18 


ARMADA: AUTONOMIC RUNTIME MANAGEMENT OF DYNAMIC APPLICATIONS 


ARMaDA is a framework for the autonomic run-time management and 
optimization of dynamic SAMR applications. Autonomic behavior is achieved by 
adapting SAMR application execution to optimize partitioning, load-balancing, and 
scheduling. Adaptation parameters include the partitioning scheme, chosen based on the 
current runtime state (GrACE, Vampire, etc.); the granularity/patch size, which affects 
load balance and overhead; and the dynamic allocation of processors (from the beginning 
or “on-demand”). Other optimizations include hierarchical decomposition using dynamic 
processor groups, communication optimization, latency tolerance, multithreading, etc. 

Autonomic application management involves system-sensitive and application- 
sensitive adaptation. System-sensitive application management uses current and predicted 
system state characterization to make application adaptation decisions. For example, the 
information about the current load and available memory may determine the granularity 
of the mapping of the application components to the processing nodes, while the 
availability and “health” of the computing elements on the grid may determine the nature 
(refined grid size, aspect ratios, etc.) of refinements to be allowed. 

Application sensitive adaptations use the current state of the application to drive 
the run-time adaptations. The abstraction and characterization of the application state is 
used to drive the resource allocations, partitioning and mapping of application 
components onto the grid, selection of partitioning and load-balancing algorithms and 
their configurations, communication mechanisms, etc. 


ARMaDA: Autonomic Run-time Management and 
Optimization for Dynamic (SAMR) Applications 


• Partitioning, load-balancing and scheduling of SAMR 
applications. 



- Partitioning Scheme 

  • “Best” partitioning based on application/system configuration and current 
  application/system state 

    - G-MISP+SP, pBD-ISP, SFC (Vampire, GrACE, Zoltan, ParMetis, ...) 

- Granularity 

  • patch size, AMR efficiency, comm./comp. ratio, overhead, node-performance, 
  load-balance, ... 

- Number of processors/Load per processor 

  • Dynamic allocations/configuration/management 

    - 1000+ processors from the beginning or “on-demand” 

- Hierarchical decomposition using dynamic processor groups 

- Communication optimizations/latency tolerance/multithreading 

- Availability, capabilities, and state of system resources 

  • SNMP, NWS 


Figure 19 


ARMADA: AUTONOMIC RUNTIME MANAGEMENT 


Starting in the upper-left of the figure, the SAMR application is monitored by the 
V-Grid Monitoring Engine to enable the V-Grid Planning and Analysis Engines to 
identify natural regions and characterize application state. Simultaneously, the V-Grid 
Monitoring Engine also monitors and characterizes the system. The synthesized system 
capability combines monitored information with history and predictive models. Both of 
these characterizations flow into the V-Grid Analysis and Execution Engines. The V-Grid 
Analysis Engine deduces objective functions, strategies, and normalized work and 
resource metrics, using policies and constraints to navigate the decision space. The 
V-Grid Execution Engine uses this information to autonomically partition or repartition the 
application into VCUs that are mapped and scheduled onto VRUs. Global-Grid 
Scheduling (GGS) is first used across VRUs and then Local-Grid Scheduling (LGS) 
within a VRU. The V-Grid Execution Engine then allocates and configures Grid 
resources and schedules execution of VRUs. This execution is, in turn, monitored by 
the monitoring engine. This flow of events occurs within a distributed framework. 

A dynamic topology of V-Grid framework agents will locally monitor the 
application and resources. Changes in the local natural regions will be monitored along 
with changes in the local resource performance. The V-Grid Analysis Engine may be 
able to make many local decisions, but may also be able to make improved decisions by 
“comparing notes” with neighboring framework agents. The autonomic partitioning and 
scheduling may move work among agents or may acquire new resources and add new 
agents to the framework. 


ARMaDA: Autonomic Runtime Management 




[Figure labels: Mapping, Distribution, Redistribution; Virtual Resource Unit] 

VGTS: Virtual Grid Time Scheduling 
VGSS: Virtual Grid Space Scheduling 


Figure 20 






ARMADA: APPLICATION-SENSITIVE ADAPTATIONS 


The ARMaDA framework performs adaptive application-sensitive partitioning 
based on the input parameters and the application’s current runtime state. Partitioning 
behavior is characterized based on the {Partitioner, Application, Computer System} 
(PAC) tuple. Each PAC tuple is evaluated using a 5-component metric that includes load 
imbalance, communication requirement, amount of data migration, partitioning induced 
overhead, and the partitioning time. The PAC relationship is dynamic and the partitioner 
P is a function of the state of the application A and the computer system C at that time. 
The octant approach is used to classify application runtime state with respect to the 
adaptation pattern, computations/communications, and activity dynamics. 

The ARMaDA framework has three components: application state monitoring and 
characterization, partitioner repository and policy engine, and an adaptation component. 
The state characterization component implements mechanisms that abstract the current 
application state in terms of the computation/communication requirements, application 
dynamics, and the nature of the adaptation. The policy engine provides an association for 
mapping octants to partitioners and the partitioning repository includes a selection from 
popular software tools such as GrACE (ISP) and Vampire (pBD-ISP, GMISP+SP). 
Subsequently, the meta-partitioner or adaptation component dynamically selects the 
appropriate partitioner at runtime and configures it with associated parameters such as 
granularity. As shown in the slide, experimental results demonstrate the improvement in 
SAMR application execution using application-sensitive partitioning: 26.19% for the 
VectorWave-2D application on 32 processors of the Linux Beowulf cluster “Frea” and 
38.28% for the RM3D application on 64 processors of the IBM SP2 “Blue Horizon”. 
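The policy lookup performed by the meta-partitioner can be pictured with the short Python sketch below. The three boolean axes follow the octant description in the text, and the partitioner names are the ones listed above, but the association in `PLACEHOLDER_POLICY` is an arbitrary placeholder and is not the mapping used by ARMaDA; the default granularity value is likewise an assumption.

```python
from typing import Dict

PARTITIONERS = ("ISP", "pBD-ISP", "G-MISP+SP")


def octant(scattered_adaptation: bool,
           communication_dominated: bool,
           high_activity: bool) -> int:
    """Encode the three binary state axes as an octant number in 0..7."""
    return ((int(scattered_adaptation) << 2)
            | (int(communication_dominated) << 1)
            | int(high_activity))


# Placeholder association only: the real octant-to-partitioner policy is not
# given in the text, so the octants are simply cycled through the names.
PLACEHOLDER_POLICY: Dict[int, str] = {
    o: PARTITIONERS[o % len(PARTITIONERS)] for o in range(8)
}


def select_partitioner(scattered_adaptation: bool,
                       communication_dominated: bool,
                       high_activity: bool,
                       granularity: int = 8) -> Dict[str, object]:
    """Look up a partitioner for the current state and attach its parameters."""
    key = octant(scattered_adaptation, communication_dominated, high_activity)
    return {"octant": key,
            "partitioner": PLACEHOLDER_POLICY[key],
            "granularity": granularity}
```

Keeping the policy in a table rather than in code reflects the separation between the policy engine and the adaptation component: the association can be changed without touching the selection logic.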


ARMaDA: Application-sensitive Adaptations 


• PAC tuple, 5-component metric 

• Octant approach: app. runtime state 

• GrACE (ISP), Vampire (pBD-ISP, 
GMISP+SP) partitioners 

• ARMaDA framework 

- Computation/communication 

- Application dynamics 

- Nature of adaptation 

• RM3D, 64 procs on “Blue Horizon” 

- 100 steps, base grid 128*32*32 

- 3 levels, RF = 2, regrid 4 steps 


ARMaDA evaluation for VectorWave-2D application on 32 processors on “Frea” 


ARMADA: SYSTEM-SENSITIVE ADAPTATIONS 


The ARMaDA framework reacts to system capabilities and current system state to 
select and tune distribution parameters by dynamically partitioning and load balancing 
the SAMR application grid hierarchy. Current system state is obtained at runtime using 
the Network Weather Service (NWS) resource monitoring tool. NWS measurements 
include CPU availability, end-to-end network bandwidth, free memory, and the amount 
of space unused on a disk. System state information, along with system capabilities, is 
then used to compute the relative capacity of each computational node as a weighted sum 
of the normalized system metrics. The weights are application dependent and reflect the 
application's computational, memory, and communication requirements. These relative 
capacities are used by the “system-sensitive” partitioner for dynamic distribution and 
load-balancing. 
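The relative capacity computation can be written out as a short Python sketch. The metric names, the sample values, the weights, and the choice to normalize each metric by its sum over all nodes are assumptions made for illustration; the text only states that the capacity is a weighted sum of normalized metrics with application-dependent weights.

```python
from typing import Dict

Metrics = Dict[str, float]


def relative_capacities(nodes: Dict[str, Metrics],
                        weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted sum of per-metric shares; the results sum to 1 across nodes.

    Assumes every metric has a positive total over the nodes.
    """
    totals = {m: sum(values[m] for values in nodes.values()) for m in weights}
    raw = {
        name: sum(weights[m] * values[m] / totals[m] for m in weights)
        for name, values in nodes.items()
    }
    scale = sum(raw.values())
    return {name: value / scale for name, value in raw.items()}


def distribute(total_work: float,
               capacities: Dict[str, float]) -> Dict[str, float]:
    """Assign work to each node in proportion to its relative capacity."""
    return {name: total_work * c for name, c in capacities.items()}


# Illustrative measurements of the kind NWS reports (values are made up).
nodes = {
    "a": {"cpu": 0.8, "memory": 512.0, "bandwidth": 100.0},
    "b": {"cpu": 0.4, "memory": 512.0, "bandwidth": 100.0},
}
weights = {"cpu": 0.5, "memory": 0.25, "bandwidth": 0.25}
capacities = relative_capacities(nodes, weights)
shares = distribute(1200.0, capacities)
```

Re-evaluating `relative_capacities` at each sensing interval and repartitioning accordingly corresponds to the dynamic runtime sensing whose frequency is studied in the experiments that follow.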

The system-sensitive partitioner is evaluated using the RM3D CFD kernel on a 
32-node Linux-based workstation cluster. The kernel used 3 levels of factor 2 space-time 
refinements on a base mesh of size 128*32*32. System-sensitive partitioning reduced 
execution time by about 18% in the case of 32 nodes. The table in the slide illustrates the 
effect of sensing frequency on overall application performance. Dynamic runtime sensing 
improves application performance by as much as 45% compared to sensing only once at 
the beginning of the simulation. In this experimental setup, the best application 
performance was achieved for a sensing frequency of 20 iterations. 


ARMaDA: System-sensitive Adaptations 


• System characteristics using NWS 

• RM3D compressible turbulence 
application 

- 128x64x64 base (coarse) grid 

- 3 levels, factor 2 refinement 

• System/Environment 

- University of Texas at Austin (32 
nodes), Rutgers (16 nodes) 


ARMADA: PROACTIVE MANAGEMENT 


The ARMaDA framework uses performance prediction functions to estimate 
execution time and application performance. Performance Functions (PF) describe the 
behavior of a system component, subsystem or compound system in terms of changes in 
one or more of its attributes. The PFs of each resource used by an application can be 
composed to generate an overall end-to-end PF that quantifies application performance. 

Performance functions model the application execution time for SAMR-based 
RM3D and describe overall behavior with respect to the computational load metric on the 
machine of choice (such as the IBM SP “Seaborg” and the Linux Beowulf “Discover”). The 
evaluation on the IBM SP yields 2 PFs, for small loads (<30,000 work units) and large 
loads (>30,000 work units) respectively, whereas the Linux Beowulf produces a single PF. 
The error in modeling the execution time is low: 0-8% for the IBM SP and 0-6% for the 
Linux Beowulf. 

The PF modeling approach is used by the ARMaDA framework to determine 
when the benefits of dynamic load redistribution exceed the costs of repartitioning and 
data movement (i.e. when the workload imbalance exceeds a certain threshold). A threshold 
of 0 indicates regular periodic load redistribution, while a high threshold represents the 
ability of the application hierarchy to tolerate workload imbalance. The RM3D evaluation 
on 8 processors of a Linux Beowulf cluster analyzes the effect of dynamic load 
redistribution on application recompose time for redistribution thresholds of 0 and 1. The 
application uses 3 refinement levels on a base mesh of size 64*16*16 with regridding every 
4 steps. A threshold of 1 takes the costs of redistributing load into account and reduces 
the recompose time by about half compared to a threshold of 0 (reported in the slide as 
an improvement of almost 100%). 
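A threshold test of this kind can be sketched as follows. The imbalance measure used here (excess of the most loaded processor over the mean, relative to the mean) and the use of `>=` so that a threshold of 0 always triggers redistribution are assumptions; the text does not define the imbalance metric or the units of the threshold.

```python
from typing import List


def load_imbalance(loads: List[float]) -> float:
    """Relative excess of the most loaded processor over the mean load.

    Assumes a non-empty list with a positive mean.
    """
    mean = sum(loads) / len(loads)
    return (max(loads) - mean) / mean


def should_redistribute(loads: List[float], threshold: float) -> bool:
    """Redistribute when the imbalance reaches the threshold.

    With threshold 0 this is always true, i.e. redistribution happens at
    every regrid step; larger thresholds tolerate more imbalance.
    """
    return load_imbalance(loads) >= threshold
```

With loads `[50, 10, 10, 10]` the imbalance in this sketch is 1.5, so redistribution is triggered for thresholds of both 0 and 1, whereas a perfectly balanced load triggers it only for a threshold of 0.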


ARMaDA: Proactive Management 


• Performance Function (PF) - behavior in terms of attribute changes 

• “Computational load” metric to model RM3D execution time 

• IBM SP “Seaborg” (NERSC) 

- PF_s - small loads (< 30000 units), PF_h - large loads (> 30000) 

- Error in modeling execution time is low (0 - 8%) 

• Linux Beowulf “Discover” (Rutgers) 

- Single PF 

- Error in modeling execution time is low (0-6%) 

• Dynamic load redistribution for RM3D & effect on “recompose” time 

- 8 processors, base mesh 64*16*16, 3 levels of factor 2 refinements 

- Redistribution thresholds of 0 and 1 

- Thresh=1 improves recompose time by 100% compared to thresh=0 


AUTONOMIC OIL WELL PLACEMENT 


The goal of this application is to dynamically optimize the placement and 
configuration of oil wells to maximize revenue. The peer components involved include: 

1. Integrated Parallel Accurate Reservoir Simulator (IPARS) providing sophisticated 
simulation components that encapsulate complex mathematical models of the 
physical interaction in the subsurface, and execute on distributed computing systems 
on the Grid. 

2. IPARS Factory responsible for configuring IPARS simulations, executing them on 
resources on the Grid and managing their execution. 

3. Very Fast Simulated Annealing (VFSA) optimization service based on statistical 
physics and the analogy between the model parameters of an optimization problem 
and particles in an idealized physical system. 

4. Economic Modeling Service that uses IPARS simulation outputs and current market 
parameters (oil prices, costs, etc.) to compute estimated revenues for a particular 
reservoir configuration. 

5. Discover Middleware that integrates Globus Grid services (GSI, MDS, GRAM, and 
GASS), via the CORBACog, and Discover remote monitoring, interactive steering, 
and collaboration services, and enables resource discovery, resource allocation, job 
scheduling, job interaction and user collaboration on the Grid. 

6. Discover Collaborative Portals providing experts (scientists, engineers) with 
collaborative access to other peer components. Using these portals, experts can 
discover and allocate resources, configure and launch peers, and monitor, interact 
with, and steer peer execution. The portals provide a shared workspace and 
encapsulate collaboration tools such as Chat and Whiteboard. 

(This slide is courtesy M. Peszynska) 


Autonomic Oil Well Placement 


• Optimization algorithm: use VFSA (Very Fast Simulated 
Annealing) 

- requires function evaluation only, no gradients 

• IPARS delivers 

- fast-forward model (guess -> objective function value) 

- post-processing 

• Formulate a parameter space 

- well position and pressure (y,z,P) 

• Formulate an objective function 

- maximize economic value Eval(y,z,P)(T) 

• Normalize the objective function NEval(y,z,P) so that: 


min NEval(y,z,P) <=> max Eval(y,z,P) 
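The gradient-free search outlined on this slide can be sketched as follows. This is a minimal illustration, not the implementation used with IPARS: the step distribution and cooling schedule follow the commonly cited form of Ingber's very fast simulated annealing, and the names `vfsa_step`, `vfsa_minimize`, `t0`, and `decay` are assumptions. Any `neval` passed in is expected to be a normalized objective whose minimum corresponds to the maximum of Eval, for example NEval = -Eval.

```python
import math
import random


def vfsa_step(u, temperature):
    """Map a uniform sample u in [0, 1) to a step in [-1, 1].

    Small temperatures concentrate steps near zero while still allowing
    occasional long jumps.
    """
    sign = 1.0 if u >= 0.5 else -1.0
    return sign * temperature * ((1.0 + 1.0 / temperature) ** abs(2.0 * u - 1.0) - 1.0)


def vfsa_minimize(neval, bounds, iterations=200, t0=1.0, decay=1.0, seed=0):
    """Minimize neval over box bounds using function evaluations only."""
    rng = random.Random(seed)
    dim = len(bounds)
    current = [rng.uniform(lo, hi) for lo, hi in bounds]
    current_cost = neval(current)
    best, best_cost = list(current), current_cost
    for k in range(1, iterations + 1):
        # Cooling schedule: T_k = T0 * exp(-decay * k^(1/D)), D = number of parameters.
        temperature = t0 * math.exp(-decay * k ** (1.0 / dim))
        candidate = []
        for x, (lo, hi) in zip(current, bounds):
            while True:  # resample until the trial value lies inside the bounds
                trial = x + vfsa_step(rng.random(), temperature) * (hi - lo)
                if lo <= trial <= hi:
                    break
            candidate.append(trial)
        cost = neval(candidate)
        delta = cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse points.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = list(candidate), cost
    return best, best_cost
```

For the well placement problem, `bounds` would hold the allowed ranges of (y, z, P), and each call to `neval` would stand for one forward reservoir simulation followed by an economic evaluation.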
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AUTONOMIC OPTIMIZATION OF OIL RESERVOIR 


The peer entities involved in the optimization process need to dynamically 
discover and interact with one another as peers to achieve the overall application 
objectives. The experts use the portals to interact with the Discover middleware and the 
Globus Grid services to discover and allocate appropriate resources, and to deploy the 
IPARS Factory, VFSA and Economic model peers (1). The IPARS Factory discovers 
and interacts with the VFSA service peer to configure and initialize it (2). The expert 
interacts with the IPARS Factory and VFSA to define application configuration 
parameters (3). The IPARS Factory then interacts with the Discover middleware to 
discover and allocate resources and to configure and execute IPARS simulations (4). 

The IPARS simulation now interacts with the Economic model to determine 
current revenues, and discovers and interacts with the VFSA service when it needs 
optimization (5). VFSA provides the IPARS Factory with optimized well information (6), 
and the Factory then launches new IPARS simulations (7). At any time, experts can discover and 
collaboratively monitor and interactively steer IPARS simulations, configure the other 
services and drive the scientific discovery process (8). Once the optimal well 
parameters are determined, the IPARS Factory configures and deploys a production 
IPARS run. 
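Steps (5) through (7) form a loop that can be sketched as below. The three callables are hypothetical stand-ins for the VFSA service, an IPARS simulation launched by the Factory, and the Economic Modeling Service; discovery, resource allocation, and interactive steering are left out.

```python
def optimize_well(propose, simulate, estimate_revenue, iterations):
    """Run the propose -> simulate -> evaluate loop and keep the best result.

    propose(best_params)      -> next well parameters to try (VFSA stand-in)
    simulate(params)          -> simulation output (IPARS stand-in)
    estimate_revenue(output)  -> estimated revenue (Economic model stand-in)
    """
    best_params, best_revenue = None, float("-inf")
    history = []
    for _ in range(iterations):
        params = propose(best_params)
        output = simulate(params)
        revenue = estimate_revenue(output)
        history.append((params, revenue))
        if revenue > best_revenue:
            best_params, best_revenue = params, revenue
    return best_params, best_revenue, history
```

The parameters returned as `best_params` correspond to the configuration that the Factory would use for the final production run.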
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AUTONOMIC OIL WELL PLACEMENT 


The figure below shows results from the autonomic oil well placement 
application. The process converges to the optimal placement in 20 
iterations. 

(This slide is courtesy M. Peszynska) 


Autonomic Oil Well Placement 
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CONCLUSION 


The computational solutions addressed by the AutoMate project are based on 
fundamental innovations in the development, optimization and deployment of 
component-based Grid applications, thereby allowing the heterogeneity and dynamics of 
the applications to match that of the Grid and fully exploit its potential. These 
innovations will enable scientists to choreograph high performance, integrated end-to-end 
simulations that were never possible or attempted before. The key IT contributions are 
the methodology and associated technologies that enable the development of applications 
that can manage and exploit the dynamism and heterogeneity of the Grid, and that 
address the extremely serious problem of software complexity that is threatening both 
academia and industry. 

We currently have working prototypes of each of the components presented in 
this paper, and are in the process of integrating them to support autonomic structured 
adaptive mesh refinement (SAMR) applications in science and engineering. Further 
information about AutoMate and its components can be obtained from 
http://automate.rutgers.edu. 


Conclusion 


• Autonomic (adaptive, interactive) applications can enable accurate 
solutions of physically realistic models of complex phenomena. 

- their implementation and management in Grid environments is a 
significant challenge 


• AutoMate provides key technologies to enable the development of 
autonomic Grid applications 

- ACCORD: Autonomic application framework 

- RUDDER: Decentralized deductive engine 

- SESAME: Dynamic access control engine 

- Pawn: P2P messaging substrate 

- SQUID: P2P discovery service 

• Application scenarios 

- V-Grid autonomic runtime management of SAMR applications 

- Autonomic optimization of oil reservoirs 



• More information, publications, software 

- www.caip.rutgers.edu/TASSL/Projects/AutoMate/ 

- automate@caip.rutgers.edu / parashar@caip.rutgers.edu 
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THE NEXT WAVE OF UBIQUITOUS COMPUTING IN KNOWLEDGE 
ECONOMY: CHALLENGES AND OPPORTUNITIES 


In this presentation, I will focus on the organizational opportunities and 
challenges that ubiquitous computing brings to organizations. Whether organizations like 
it or not, a fundamental paradigm shift in organizational computing is taking place. This, 
along with changes in the society and economy in general, presents new opportunities 
and challenges to organizations that they’ve never faced before. 
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FOUR WAVES OF ORGANIZATIONAL COMPUTING 


We can think of four distinct waves of computing paradigms in organizations. 
The first is the data processing era, in which mainframe computers were used to automate 
back-office tasks. In the late 1970s and early 1980s, this wave was replaced by the Micro wave, 
represented by personal computers and end-user productivity software (WordPerfect, 
Lotus 1-2-3, dBase III+, and Harvard Graphics). The introduction of local area networks 
and, later, the Internet once again changed the nature of computing and took us into 
the Network era. At this point, we are yet again experiencing a transition, from the Network wave to 
the Ubiquitous wave. Each wave of computing represents not only more advanced and 
powerful computer hardware and software, but also a change in the strategic significance 
of IT in organizations. It is a journey from the back office to the front office. It is a journey 
from being a utility to being a strategic asset. This trend will continue in the Ubiquitous wave. 
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CHANGES IN BUSINESS ENVIRONMENTS 


In order to properly understand the importance of ubiquitous computing, we need 
to put it into the context of the current societal and economic environment. First, it is 
a knowledge economy, where knowledge is the primary means to add and create value. In 
a knowledge economy, being connected is more important than having possessions. In the 
industrial economy, physical products had to be purchased and owned in order to be 
consumed. In a knowledge economy, consumers need to be connected to experience 
knowledge-based services. As such, physical assets are not as important as knowledge 
assets. Second, it is a networked economy. Value is created not by a heroic individual 
or a single firm, but by a community of distributed agents. This requires a 
fundamental shift in our thinking about organizing. Finally, we are facing a 
fundamentally different market, with customers who grew up with computers and Nintendo 
video games. This new global generation of customers emphasizes aesthetics and 
spontaneity in its consumer experiences. 


Changes in Business Environments 


> Knowledge economy 

  - From possession to connection 
  - From having to experiencing 
  - From physical products to knowledge products 
  - From physical assets to knowledge resources 
  - Values are created through the integration of knowledge resources 

> Networked economy 

  - From a single firm to a pack of runners 
  - From an individual hero to a community of distributed agents 
  - From stand-alone machines to socio-technical web 

> Different market 

  - Nintendo generation 
  - Spontaneity 
  - Global 
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AN ENVIRONMENT IN CHANGE 


Such technological and environmental changes, along with a series of recent 
deregulations, have created a new environment. In this new environment, traditionally 
separate industries come together and compete in the same space. 


An Environment in Change 


Telephony Wireless Computers Content 



Technology Convergence Is Breaking Down Barriers Between 
Historically Separate Industry Segments 


CASE Weatherhead School of Management 


Emerging Computing Workshop, NASA Langley 
03.19.2003 


Figure 4 


DRIVERS FOR DIGITAL SERVICES 


These changes in the economy, society, and technology can be summarized as the 
emergence of digital services as core elements of economic activity. 
This emergence of digital services is neither purely technical nor purely organizational. It is a 
socio-technical shift in society. This is far more significant and fundamental than the 
emergence of web-commerce. In fact, many of the prior technological innovations in 
organizations (such as e-commerce, business process reengineering, enterprise resource 
planning systems, etc.) can be seen as the fundamental basis for this unavoidable emergence 
of the digital service economy. 


Drivers for Digital Services 



Easy to Use Digital Services 
on Intelligent Devices 
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THE NEW DIGITAL ENVIRONMENT 


Thus, in this new digital environment, there will be both technological push and 
market pull. While the emergence of the knowledge economy demands anytime, 
anyplace service delivery, the technology will be there to enable such ubiquitous digital 
business transactions. Similarly, digital convergence will enable mass customization. 


The New Digital Environment 


Technology 


Market 
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TECHNOLOGICAL CHANGES 


All of these changes in the economy and society are fueled further by the 
relentless developments in technology in all areas. In particular, the next wave of 
organizational computing will be built on a heterogeneous distributed computing 
infrastructure along with novel technology kernels in hardware, software standards, and 
telecommunication and network technologies. In the next few slides, I will examine three 
key drivers of this new technological environment. 


Technological Changes 


> Fast change in all computing technologies 

> Heterogeneous and distributed computing 

  - Novel technology kernels (hardware, telecom, system software standards) 
  - Distributed system architectures (design, control, performance) 
  - Heterogeneous interoperability (services, semantics, metadata, ontologies) 
  - Key features: mobility, net-centric services, intelligent agents 
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THREE KEY TRENDS 


In short, the new organizational computing environments can be characterized 
by three key words: mobility, digital convergence, and mass scale. 


Three Key Trends 
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DIGITAL CONVERGENCE 


The rapid penetration of digital technology into all forms of computing and 
communication has been enabled by the dramatic reduction in computing costs and 
the emergence of open standards and new chip designs. Digital convergence enables 
new forms of engagement with digital services and new services in areas such as entertainment 
and telematics. Often, these new services require the integration of services 
traditionally offered in separate channels. 


Digital convergence 


> Digital convergence: enabled by computing costs and chip design + 
open standards 

  - New forms of engagement with digital services 
  - New services (entertainment, telematics) 
  - Integration of services (video + data) 

> Challenges 

  - Requires independence between the content and the medium (Ex: CNN service) 
  - Requires miniaturization of devices 
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MOBILITY 


The second major characteristic is mobility. We often think of physical 
mobility as a uni-dimensional concept. However, mobility can be divided into micro-, 
local-, and remote-mobility, depending on the geographical area that is 
covered. We also need to think of social mobility as well as physical mobility. In the past, 
when particular computing activities were tied to particular time-space combinations, 
social mobility was relatively stable: one social role in one physical place. However, as 
the physical mobility of computing devices becomes higher, one can have a high degree of 
social mobility even within the same geographic location and temporal boundary. In 
order to support both physical and social mobility, organizations need to develop a 
socio-technical ontology. 


Mobility 


> Mobility covers physical mobility and social mobility 

  - Social: roles, capabilities, rights, preferences 
  - Physical: micro mobility, local mobility, remote mobility 
  - Requires mobility of services across platforms 

> Enables new services as combinations of social and physical mobility 
and independence between services and locations 

> Challenges 

  - Interoperability and peer-to-peer synchronization become critical 
  - Requires social ontology to support social mobility 
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MASS SCALE 


The combination of digital convergence and mobility leads to an unprecedentedly large 
scale of deployment of various types of digital services at a global level. In this new mass 
scale era, new technical challenges emerge, including scalability, reliability, complexity, 
security, and performance. The emergence of grid computing and autonomic computing, 
for example, will play a vital role in supporting such a mass scale. 


Mass Scale 


> Mass scale: services provided in principle at a global level, 
pervasiveness implies high volumes 

  - Internet capable mobile devices: 1 billion by 2003 
  - 300 million Bluetooth devices in US alone by 2003 
  - PDA sales in US in 2000 were $1.03 billion 

> Challenges: scalability, reliability, complexity, security and performance 

> These are affected by both mobility (coverage, network features) and 
digital convergence (bandwidth, QoS) 
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UBIQUITOUS INFORMATION ENVIRONMENT 


To summarize, a ubiquitous computing environment can be understood as a 
heterogeneous assemblage of interconnected technological and organizational elements, 
enabling both physical and social mobility of computing and communication services 
between organizational actors both within and across organizational boundaries. The 
impact and challenges of ubiquitous computing need to be understood as an integral part 
of modern complex organizations as socio-technical webs of distributed intelligent 
agents. 


Ubiquitous Information Environment 


> A heterogeneous assemblage of interconnected technological and 
organizational elements, enabling physical and social mobility of 
computing and communication services between organizational actors 
both within and across organizational boundaries 

> The Theme of International Conference on Information Systems 2003: 
IT Everywhere 

> An integral part of modern complex organizations as socio-technical 
webs of distributed intelligent agents 
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MOVEMENTS TO UBIQUITOUS COMPUTING 


This figure shows the movement from traditional organizational computing where 
both mobility and the degree of embeddedness of computing in environments were low to 
ubiquitous computing where both of them are high. Further, it shows the conceptual 
differences among pervasive computing, mobile computing, and ubiquitous computing. 


Movements to Ubiquitous Computing 
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OPPORTUNITIES 


These technological developments can enable new and disruptive business models 
and opportunities. For example, Progressive Insurance experimented with a “usage-based” 
auto insurance premium model in Texas. The service was enabled by GPS chips 
with a dial-up modem installed in cars, along with a powerful database of past history. 
Customers were charged based on their actual driving patterns, rather than their personal 
profiles. Such a revolutionary product, enabled by the combination of powerful ubiquitous 
computing tools, can potentially cause dramatic disruption in the market. Organizations 
need to proactively seek to leverage these emerging ubiquitous computing tools in order to 
create this type of disruptive opportunity. 


Opportunities 


> New and disruptive business models are possible 

> Examples 

  - Telematics 
  - Home digital media 
  - High-velocity coordination systems 
  - On-demand distributed training and learning 
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CHALLENGES 


However, organizations need to overcome significant technical, organizational, 
and institutional challenges in order to take advantage of emerging ubiquitous computing. 


Challenges 


> Technical 
> Organizational 
> Institutional 
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INFRASTRUCTURE 


Technical challenges can be divided into two areas: infrastructure and services. For 
infrastructure, ubiquitous computing means an environment that is technically heterogeneous, geographically 
dispersed, and institutionally complex, without a centralized coordination mechanism. 
Thus, as pointed out earlier, providing interoperability, scalability, stability, reliability 
and persistence through infrastructure will be critically important. Furthermore, since 
ubiquitous computing involves many diverse devices, seamless integration among 
different devices, services and platforms will become a key IT management issue. The 
ubiquitous computing infrastructure needs to provide location awareness, service 
availability, physical and social mobility, and social ontology. 


Infrastructure 


> Characteristics 

  - Technically heterogeneous, geographically dispersed, and institutionally 
    complex without centralized coordination mechanism 

> Challenges 

  - Interoperability, scalability, stability, reliability and persistence 
  - Seamless integration of heterogeneous devices, services, and platforms 
  - Location awareness, service availability, physical and social mobility, and 
    social ontology 
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SERVICES 


In ubiquitous computing environments, services need to be configured 
dynamically with varying lifecycles from many different sources. Thus, personalization 
and mobility support will be important. In order to support these two aspects, content 
and medium need to be separated, and the infrastructure needs to provide context awareness to 
the devices and services. 


Services 


> Services need to be configured dynamically with varying lifecycles from 
many different sources 

> Challenges 

  - New services 
  - Personalization 
  - Mobility support 
  - Content and medium separation: management of content and metadata 
    largely unresolved (despite XML) 
  - Location and time (context) awareness 
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ORGANIZATIONAL INNOVATION 


According to the history of technology-enabled innovations, most such 
innovations failed not because of technological breakdowns, but because of a lack of 
complementary organizational innovations. The integration of both social and physical 
mobility will foster novel forms of social/technological innovation and demand new ways 
of organizing. However, such new forms of organizing are often hard to realize. Since 
technology is often designed based only on past use, it is much more 
difficult to foresee what the future would look like and to build new technology based on 
such visions. Thus, it is critically important to take a co-evolutionary approach to building 
ubiquitous computing environments, through experiments and trial and error. 


Organizational Innovation 


> The integration of both social and physical mobility will foster novel 
forms of social/technological innovation and demand new research 
approaches 

> Driven by both grassroots experiments and novel theoretical and 
methodological choices 

> May demand a radical shift in the focus of research and associated 
business innovation 

> Focus on experimentation, learning from technology trials, novel 
theorizing 

> Designing unforeseen future using technology 
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ORGANIZATIONAL CHALLENGES 


Such experiments and co-evolutionary approaches will eventually help us 
understand many of the questions raised in this slide. One of the most critical aspects of the 
organizational challenges is the notion of un-packaging and re-packaging digital 
services based on innovative business models. In the past, particular services were tied up 
with particular physical products. Through digital convergence, however, digital services 
can be separated from the medium (physical products). Once separated, these contents 
and media can be re-packaged in different combinations, enabling novel offerings 
of digital services. Another critical challenge is to figure out how to make money out of 
this emerging digital service model. In particular, it is very difficult to understand the 
demand for this type of novel digital service. Also, we have very little understanding of 
how users consume information-rich products. In order to offer this type of novel 
digital service, organizations will have to develop completely new forms of strategic 
alliances with companies coming from different and remote industries. 


Organizational Challenges 


> SERVICE CONCEPTS AND STRATEGIES 

  - How to make money? 
  - All information services may need careful rethinking and can be transformed 
  - Design, management and (un-)packaging of digital services based on 
    innovative business models 
  - Understanding user needs in a new information-rich environment 
  - Understanding the demand and what drives it is difficult 
  - Evolution and expansion of services based on user learning 
  - Many services are based on increasing returns and are community based 
  - New forms of strategic alliances required 
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INSTITUTIONAL CHALLENGES 


Finally, institutional challenges are the most enduring and difficult challenges to 
overcome. They include political and regulatory challenges, such as frequency allocation, 
access technologies, privacy, security, and the regulation of addresses and access types. Often 
these institutional challenges are embedded in our everyday lives and are hard to discern 
and predict. In order to fully realize the early promises of ubiquitous computing 
environments, technologists and organizational management alike need to be keenly 
aware of these institutional challenges. 


Institutional Challenges 


> POLITICAL AND REGULATORY CHALLENGES 

  - Diffusion demands coordination on institutional policies: frequencies, access 
    technologies, privacy, security, addresses and access types 
  - Stakes are high nationally and worldwide: the race concerning the winners of 
    the next wave of technological transformation will be furious! 

> Institutional barriers 

  - Regulatory regime: innovations in construction and architecture industry 
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CATCH THE WAVE! 


In order to pursue these emerging opportunities, more than ever, we need 
interdisciplinary approaches crossing the boundaries between traditional academic 
disciplines. Furthermore, we will need to combine basic research with application 
development, because one leads to the other, as discussed earlier. Certainly, the current 
institutional arrangements at many universities and research laboratories do not make such 
interdisciplinary research easy. 


Catch the Wave! 



Figure 21 
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GRID & AUTONOMIC COMPUTING -THE NEXT EVOLUTION 


Grid is a key part of IBM’s on demand strategy, a powerful vision for the 
computing enterprise. Our talk today is intended to focus less on the vision and strategy 
and more on what we have available to help solve your most pressing business and IT 
problems today. 





Grid Computing 
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CONTENTS 


Grid computing is an emerging technology, but Grid is already delivering real 
business value to customers today. This first section will focus on the major areas in which we see 
that taking place and show you some examples of what we mean. 
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ON DEMAND OPERATING ENVIRONMENT ATTRIBUTES 


An On Demand operating environment has these four characteristics: 

1) Integration: IBM has the middleware products to deliver on this today. 

2) Open standards: Web services platforms and open architectures to 
enable rapid deployment and integration of business process 
applications. IBM embraces open standards across its product 
brands; this is a key differentiator. 

3) Virtualized: there is now an opportunity to virtualize the entire data 
center with the emerging technology of grid computing. 

4) Autonomic: self-managing systems that include IBM’s Tivoli 
management software and the DB2 database, with self-tuning and 
self-managing features. 

Grid is a key enabler of the On Demand Operating Environment. 


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THE ON DEMAND OPERATING ENVIRONMENT 


Many new types of input devices and systems need to be integrated. This 
integration is based on open middleware. 
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AGENCIES FACE CHANGING MARKET DYNAMICS 


The dynamics of the market have changed tremendously over the last few years. 
Users need technology to adapt to them, rather than having to adapt to the 
technology. Unpredictable fluctuations in the market create uncertainty, so businesses 
need a flexible environment that can handle dynamic changes. 
Collaboration among departments, within agencies, and globally is becoming more 
essential than ever before. 



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GRID COMPUTING DEFINED 


Grids create a virtualized data center: Grids tie together resources across 
geographical boundaries, organizational boundaries, and system types. 
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VIRTUALIZATION TECHNOLOGIES TODAY 


Virtualizing technologies are not new. We have been using these technologies for 
some time and are now implementing them across heterogeneous environments. 


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USES OF GRID TECHNOLOGY 


There are four models of use with grid technology. Processing grids aggregate 
the power of heterogeneous servers or desktops to take advantage of unused cycles. Data 
grids enable sharing of disparate data and collaboration across virtual organizations for 
intelligent decision making using dispersed data and multiple data formats. Resiliency 
grids enable continuous business operations in case any system in the grid should become 
disabled due to unplanned or disastrous events. On demand uses the grid 
architecture and infrastructure to provide a utility for compute resources. Through 
metering and billing, users will be charged appropriately for the resources they use, 
much as with electricity and water utilities today. 
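The metering-and-billing idea behind the on demand model can be illustrated with a small sketch. The record format, the flat rate per CPU-hour, and the function name are assumptions made for this example and do not describe any IBM product.

```python
from collections import defaultdict


def bill_usage(records, rate_per_cpu_hour):
    """Sum metered CPU-hours per user and price them at a flat rate.

    records: iterable of (user, cpu_hours) pairs, as a meter might emit them.
    """
    totals = defaultdict(float)
    for user, cpu_hours in records:
        totals[user] += cpu_hours
    return {user: hours * rate_per_cpu_hour for user, hours in totals.items()}
```

A real utility service would likely meter more than CPU time (storage and network, for example) and apply tiered rates, but the principle of charging only for what is consumed is the same.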
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GRID DEPLOYMENT OPTIONS 


Customers are deploying grids in many different ways. Intragrids sit within a 
company’s firewall to enable inter-department collaboration and sharing of resources. 
Extragrids connect companies with their suppliers and partners. Intergrids enable 
collaboration across multiple agencies through the Internet. Many researchers in 
universities are beginning to deploy grids in this manner to enable research data to be shared. 
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GRID ADDRESSES THESE NEEDS 


Grid is delivering real value to businesses today. 

Companies are using Grid today in three general areas: 

1) To improve utilization of computational resources 

2) To enable collaboration between organizations, and 

3) To improve the flexibility and resiliency of computing infrastructures. 
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• Infrastructure Optimization 
  - Workload Management and Consolidation 
  - Reduced Cycle Times 

• Increased Access to Data and Collaboration 
  - Federation of Data 
  - Global Distribution 

• Resilient / Highly Available Infrastructure 
  - Business Continuity 
  - Recovery and Failover 
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LOW INFRASTRUCTURE UTILIZATION 


One of the main drivers for grid computing is the ability of organizations to do 
more with the assets they already own. Mainframes typically do a good job of 
maximizing utilization. Grids can maximize utilization of UNIX and Intel-based systems 
and can aggregate their spare processing cycles to work on jobs that were not 
viable before. 





Low Infrastructure Utilization 



               Peak-hour      Prime-shift    24-hour Period 
               Utilization    Utilization    Utilization 

Mainframes     85-100%        70%            60% 
UNIX           50-70%         10-15%         <10% 
Intel-based    30%            5-10%          2-5% 
Storage        N/A            N/A            52% 


Source: IBM Scorpion White Paper: Simplifying the Corporate IT Infrastructure, 2000 
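To make the table concrete, the unused capacity can be estimated with simple arithmetic. The sketch below assumes single-CPU machines and a flat utilization over 24 hours; the machine count in the note that follows is a made-up example, not a figure from the source.

```python
def idle_cpu_hours_per_day(machine_count, utilization_fraction):
    """CPU-hours per day left unused by machine_count single-CPU machines."""
    return machine_count * 24 * (1.0 - utilization_fraction)
```

For instance, 100 Intel-based machines running at the upper end of the 24-hour range in the table (5%) would leave roughly 100 * 24 * 0.95 = 2280 CPU-hours unused each day, which is the kind of capacity a processing grid aims to reclaim.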
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RESILIENT/ HIGHLY AVAILABLE INFRASTRUCTURE 


Another significant motivation for employing Grids is the need to reduce the time 
that it takes to complete a particular computation. Often this need is a critical part of the 
value proposition for a particular business function. 

For example, if a particularly compute-intensive task must be accomplished in a 
short span of time, opportunities for parallel execution of parts of the calculation can be 
exploited to complete the job more quickly than if the entire calculation were performed 
serially. In fact, in many cases sequential execution of the problem might take so long as 
to render the final result unusable. 

In another case, advantages may be gained by running a particular computation 
more often. For example, if airline pricing and load management algorithms, which are 
fairly complex calculations, can be completed more rapidly, they can be executed more 
often, allowing the company to respond more rapidly to changing market conditions and 
better utilize its planes, personnel, and fuel resources. 

The animation above shows three jobs that are scheduled to run on three different 
servers. During the course of running Job 1, the server it is running on has an outage. 
This might be a Sun server which fails, or it might be a server going down for scheduled 
maintenance. Its really not important what the reason is. Using Grid middleware, a 
scheduler can detect that Job 1 did not complete and reschedule that job to run on another 
available computing resource. This ensures that all critical tasks are completed. 
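
The rescheduling behavior described above can be sketched in a few lines. This is a toy model under stated assumptions: the server names, the `ServerDown` exception, and the round-robin retry policy are all invented for illustration and do not represent any specific Grid middleware.

```python
class ServerDown(Exception):
    """Raised when a server cannot finish the job it was given."""


def run_on(server, job, down_servers):
    """Pretend to run `job` on `server`; fail if the server is in `down_servers`."""
    if server in down_servers:
        raise ServerDown(server)
    return f"{job} completed on {server}"


def schedule(jobs, servers, down_servers=frozenset()):
    """Assign jobs round-robin; when a job does not complete, retry it on the next server."""
    results = {}
    for index, job in enumerate(jobs):
        for attempt in range(len(servers)):
            server = servers[(index + attempt) % len(servers)]
            try:
                results[job] = run_on(server, job, down_servers)
                break
            except ServerDown:
                continue  # the job did not complete; try another resource
        else:
            results[job] = f"{job} could not be placed"
    return results


RESULTS = schedule(
    ["Job 1", "Job 2", "Job 3"],
    ["server-a", "server-b", "server-c"],
    down_servers={"server-a"},
)
print(RESULTS)
```

Here Job 1 starts on the failed server and is rerun on the next one, while the other two jobs are unaffected.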

Most corporate computing users don’t care where their application runs. They 
want good performance and they need their data to be secure. Using intelligent 
scheduling middleware, a company can utilize the most available, appropriate asset to run 
a given task. It might be the case that during the trading day in New York, the bank's data 
centers in Tokyo or London are idle. The employees there are home sleeping. The 
middleware can schedule jobs to run in the overseas data centers, improving performance 
of the application and off-loading workload from New York, a win-win scenario. 
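
A minimal sketch of this kind of time-zone-aware placement follows. The fixed UTC offsets (which ignore daylight saving time), the 09:00-17:00 business day, and the function names are all assumptions made for the example.

```python
# Fixed UTC offsets in hours; these ignore daylight saving time and are assumptions.
UTC_OFFSETS = {"New York": -5, "London": 0, "Tokyo": 9}
BUSINESS_HOURS = range(9, 17)  # 09:00-16:59 local time, an assumed working day


def local_hour(site, utc_hour):
    """Convert an hour of the day in UTC to the local hour at `site`."""
    return (utc_hour + UTC_OFFSETS[site]) % 24


def idle_sites(utc_hour):
    """Return the sites whose local time is outside business hours."""
    return [site for site in UTC_OFFSETS
            if local_hour(site, utc_hour) not in BUSINESS_HOURS]


def place_job(origin, utc_hour):
    """Prefer an idle remote site; fall back to the origin if none is idle."""
    candidates = [site for site in idle_sites(utc_hour) if site != origin]
    return candidates[0] if candidates else origin


print(place_job("New York", 15))
```

A production scheduler would also weigh data location, network cost, and security policy, which this sketch leaves out.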
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THE VALUE OF OPEN STANDARDS 


None of these models will happen without open standards. I think we have seen a 
pretty clear path over the last 10-15 years of open-standards-based computing, starting all 
the way back with networking. Other networking protocols came out (SNA, NetBIOS, 
etc.), but people rallied around TCP/IP. It became an open-standard approach for taking 
many different computer types and allowing them to communicate over a network. 

From a communications perspective, we started to see e-mail packages emerge 
and we now have standards like SMTP, POP3 and MIME. MIME was a very important 
standard that allowed different e-mail packages to be able to communicate with one 
another, and standardized how attachments were handled. 



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THE VALUE OF OPEN STANDARDS - (CONTD.) 


From an information perspective we saw the World Wide Web emerge, again 
using new protocols and open standards like HTML and HTTP. J2EE has emerged as a 
standard for the Web infrastructure to communicate with backend transactional systems, 
your corporate data. In operating systems, LINUX (the penguin) has become very popular 
around the world from a server infrastructure perspective. New features and functions are 
being added to LINUX around the world at an incredible pace. The open source 
community and companies like IBM are participating in this open movement. Because of 
this, LINUX has become a very popular operating system for server environments and is 
becoming almost a de facto standard as an operating system for servers. From an 
application perspective, we have seen the web services standards emerge, focused on 
SOAP as the transport layer, WSDL as the web services definition layer and UDDI as the 
directory of web services. Web Services is all about hooking up applications and making 
application-to-application communications simpler for developers within an enterprise. 
Developers can now quickly find web services and assemble them into applications; 
again, the use of open standards is driving more value to the business. 
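
To give a feel for what a web services call looks like on the wire, the sketch below builds a SOAP 1.1 request envelope with Python's standard XML library. The service namespace, the `GetFare` operation, and its parameters are hypothetical; only the SOAP envelope namespace is a real, standardized value.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/pricing"  # hypothetical service namespace


def build_request(operation, params):
    """Build a SOAP request whose Body holds one operation element with child parameters."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


REQUEST_XML = build_request("GetFare", {"origin": "ORF", "destination": "JFK"})
print(REQUEST_XML)
```

In practice the operation and parameter names would come from the service's WSDL description rather than being typed in by hand.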

And today we are talking about distributed computing - grid computing. And yes 
again we have a standards body and a process of working with an open standards based 
community (Global Grid Forum - GGF). IBM is working on the standards for distributed 
computing with this group. The technology we are all developing is called OGSA, which 
stands for Open Grid Services Architecture. 

If you look at these standards, it is pretty easy to come to the realization that 
OGSA will be to grid what TCP/IP was to networking, what HTML/HTTP were to the 
Web. If you want to build a grid of distributed systems & distributed resources you are 
going to need OGSA on all those platforms and resources within your environment. This 
is very similar to networking. If you are going to build a network of many different 
platforms, you need TCP/IP on those platforms for them to communicate. 
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GRID MIDDLEWARE TODAY 


The world of Grid middleware today is very similar to the early days of 
networking. These are some examples of Grid middleware. We are working with all of 
these in various engagements. Each, on its own, has some excellent technical capabilities. 
But today it is not possible to use the workload scheduling capability from one product, 
the data management from another and the systems management from another and have 
confidence in the interoperability of the solution. These are essentially proprietary 
solutions today as no standard exists yet. 



Figure 14 
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OPEN GRID SERVICES ARCHITECTURE (OGSA) 


The Open Grid Services Architecture, or OGSA, will change that. Defined by the 
Global Grid Forum (or GGF), in which IBM is playing a very active role, the OGSA will 
be the standard protocol for Grid computing. 

The Globus Project is an open source implementation of OGSA (based on the 
GGF specification) and a toolkit (Globus 3.0) that provides a set of APIs to implement 
Grid applications. 

IBM and other leading vendors are all sponsors of GGF. As I will show you in a 
moment, IBM is OGSA-enabling all of our related products. Also, all of the middleware 
providers that I showed you in the previous slide are committed to OGSA and will 
implement it in their products. 

OGSA will be the TCP/IP of Grid computing. It is the common protocol that all 
computing resources must support to join and interoperate in a Grid. All Grid related 
middleware will support this standard allowing for interoperability of Grid solutions. 



Figure 15 







ARCHITECTURE FRAMEWORK OGSA STRUCTURE 


So now let’s talk about the architecture related to OGSA and Grids. 

First, all computing hardware devices that wish to participate in a Grid will be 
OGSA-enabled. This includes servers, storage and I/O devices. IBM is enabling all 
of our e-Servers and TotalStorage products. 

Next is the general middleware layer, where many or most databases, file systems, 
security services, etc., will be OGSA-enabled. 

Next is the Web Server engine, the container for the OGSA functionality. There 
will be many implementations of OGSA carried by open source implementations like 
JBOSS and by products such as WebSphere, IBM’s strategic web engine. 

OGSA, the Open Grid Services Architecture, is being written as J2EE and it will 
be based on web services. This is a very important point. The developers that are 
working on the Open Grid Services Architecture decided not to recreate the world; they 
decided to base their work on another standard that is available today: web services. 
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ARCHITECTURE FRAMEWORK OGSA STRUCTURE (CONTD.) 


It is a standard that has been driven for application developers, to make 
application development and application integration easier. Web services is a perfect way 
to implement the open-standards-based grid protocols. 

Sitting on top of this stack are the applications that will exploit this functionality. 

This function will require autonomic functionality in the infrastructure to keep 
devices available. We also see a big role for services as the fact is that today grids are 
built, they are not bought. We believe that our experience and skills participating and 
helping to build the most significant grids in the world are an important core competency 
that IBM brings to the table with our clients. 
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ARCHITECTURE FRAMEWORK PRODUCTS AND SERVICES FOR GRIDS 


Here you see the total stack populated with the types of products we expect to see 
running in the Grid world in the future. Please note that there are no commitments from 
these application ISVs at this time, but I want you to understand that Grid is not just 
about HPC applications. These capabilities will open up important functions for 
mainstream business applications as well. 

Some of IBM’s key Grid partners are in this picture: Platform, Avaki, 
DataSynapse, Entropia, United Devices, and the Globus Project. Today they have 
software, grid middleware, that allows customers to build grids. In the future they will be 
recasting their products to work on top of OGSA. What they have all realized is that we 
don’t need 7 or 10 proprietary ways of building grids in the world, just like we did not 
need 7 to 10 ways to do networking in the world. We need one open standard way that all 
customers can depend on, a standard that allows Grid ISVs to provide higher level grid 
services. Customers and application ISVs can be assured that there is one open standard 
way of building and deploying these services. This is what OGSA is all about. 
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ARCHITECTURE FRAMEWORK PRODUCTS AND 
SERVICES FOR GRIDS - (CONTD.) 


As you see in the middle of this picture, OGSA will need an application server 
and Web services engine. At IBM we are very excited about this architecture because we 
feel we have developed and continue to improve on the best web services engine in the 
world - WebSphere. We expect to make many enhancements to our WebSphere product 
with respect to web services. We expect to be able to run web services better than anyone 
in the industry on multiple platforms. We intend to provide the highest level of resiliency 
in the industry, and the highest Quality of Service for web services. We have been told by 
our customers that supporting multiple platforms, resiliency, QoS, and open standards are 
some of the most important things they want in their IT infrastructure. We will deliver 
Grids through WebSphere. Tivoli products will be enhanced and focused on grid 
deployment and management, and our storage and database products are being enhanced 
to support Grids. 

As this architecture shows, OGSA will be used as the open-standards-based 
protocol that will support multiple servers, operating systems, storage & data systems in a 
very resilient fashion. This is the architecture of grid and of future IT environments. 
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AUTONOMIC COMPUTING 


Why is Autonomic Computing important in a Grid environment? To net it out, it 
means that systems are self-configuring, self-healing, self-optimizing and self-protecting; 
it means that systems do the work, freeing IT professionals to focus on other critical 
business needs. 

■ Autonomic computing systems are self-configuring, self-healing, 
self-optimizing and self-protecting. 

■ Self-configuring systems increase IT responsiveness/agility 

■ Self-healing systems improve business resiliency 

■ Self-optimizing systems improve operational efficiency 

■ Self-protecting systems help secure information and resources 
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AUTONOMIC EXAMPLES 


How does Autonomic computing fit into Grid? 

Autonomic capabilities are found in Grids today, as they are already available in 
many IBM products, including... 
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TOWARDS AUTONOMIC COMPUTING 


■ OGSA enables an open integrated infrastructure to be built 

■ Grid computing means accessing and sharing the infrastructure 

■ Autonomic helps in managing the infrastructure 

■ Web services provide consistent interfaces and taxonomy 
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GRID ADOPTION CURVE 


The early adopters of grid computing were universities and research scientists 
who needed more and more compute power and couldn’t afford the cost of the 
supercomputers to do their work. Aggregating the capacity of multiple computers 
provided an answer to their problems. We continue to see every industry building grids 
today and learning about the business value that grids bring. While high performance, 
numeric intensive environments were the early adopter application drivers, commercial 
applications are returning significant ROI for businesses today and the trend will continue 
to increase. 
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INDUSTRY APPLICATIONS 


Grids are being built in every industry today. Some of the key applications within 
these industries are shown here. 



Industry Applications: Unique by Industry with Common Characteristics 
(labels recovered from the figure: Manufacturing, Gov’t & Education, Grid 
Infrastructure, Primary Focus) 


Figure 23 




SCIENTIFIC GRID EXAMPLES 


Some examples of grids are shown; more information on each can be found on 
our website at www.ibm.com/grid 
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R&D GRID: THE TERAGRID 


An example of a large grid implementation is the TeraGrid at NCSA, the San Diego 
Supercomputer Center, Argonne National Laboratory, and Caltech. 
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COMMERCIAL EXAMPLES 


More examples showing commercial application use. IBM is ‘eating our own 
cooking’ by using grids in many areas of our business. 
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IBM 


IBM has all of our research centers around the world on an intragrid as well as 
using grids in our manufacturing plants, benchmarking centers, and design centers. 
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BUTTERFLY.NET 


Butterfly.net is a commercial application of a grid showing significant business 
value for a company in the video gaming business. 
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eDIAMOND 


The UK eScience initiative is designed to tackle issues dealing with cancer 
research and shortening the time to find cures. 
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IBM GRID OFFERINGS 


IBM has announced several Grid Offerings. Targeted toward industry segments, 
the offerings provide options in middleware, hardware accelerators, etc., to meet the needs 
of each customer’s business problems. 
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Government Grid offering: 
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IBM GRID FOCUS AREAS 


Grid is most often being implemented in these 5 areas: 

Government Development 
Enterprise Optimization 
R&D 
Engineering & Design 
Business Analytics 

IBM has 10 offerings available today in these 5 focus areas for select industries: 
Life Sciences 
Financial Services 
Aerospace 
Automotive 
Government 
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GRIDS DELIVER BUSINESS VALUE 


Grids can deliver real business value today. Even though the standards continue 
to evolve and the journey to on demand computing may take a few years to mature, 
building the infrastructure today is key. 
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GRID COMPUTING - 

- CHANGING THE IT INFRASTRUCTURE 


Steve Salkeld 
Platform Computing 
Brampton, Canada 
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Platform 
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TODAY’S BUSINESS CLIMATE 


Adapting to changing marketplace demands has left businesses challenged 
to rethink their approach to IT. The spending approach that was identified with the 
dot-com era or pre-recession times is no more. Investment decisions must be made 
carefully and in the real context of the demands of the business. It is no longer acceptable 
for IT to manage boxes: 99.99% uptime for networks is useless if the database the 
application depends on is down. Providing the complete view of the servers, applications, 
web services and data, along with the people who are using them, is increasingly the 
minimum stakes for IT. The undercurrent from the past few years has led to the 
financial arm of the corporation expecting more than “hand-waving” ROI. It must be clear 
and linked to the business. The one constant in today’s marketplace is change. It falls on 
IT to adapt to the changing demands of business units, economic trends and strategic 
decisions made by the corporation. 
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IT DRIVERS 


The IT drivers, in fact the corporate drivers, that continually come up 
with the customers we have spoken with are as follows: 

Planning - IT investment decisions are often made in a vacuum with little context 
of past performance. For example, the switch from Solaris to Linux may save upfront 
capital expenditure, but how will the applications perform? Under normal load? Under 
production load? During peak demand? What about provisioning for peak loads? Is it the 
right methodology? 

Server Consolidation - a practice driven by merged operations and the demand to 
simplify the management of distributed operations. Looking at the Life Sciences market, 
the global merging of corporate entities means overlapping IT and business units. The 
ability to successfully implement a consolidation program is predicated on understanding 
where to make the best, most effective changes. 

Business Continuity - both availability and service levels underscore the need for 
the consistent delivery of IT services. This requires understanding the full breadth of 
services in terms of the time of day, key service windows, and a holistic view of all the 
components of the services. 

ROI - Providing clear ROI, in terms of cost, productivity and value during the 
key windows of corporate performance (market open, product data management load 
time, B2B uptime), will provide real-world accountability for these services. 
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WHY ARE WE HERE? 


GM has architected a new system that will enable GM to achieve its enterprise 
management goal. The NGM team has proven that it can deliver it! 


‘Doing more with less’ requires an understanding of what your resources are doing, 
and how to most effectively share and optimize the use of those resources. 


A large auto manufacturer says 

“ In order to convert [enterprise management] 
data into information, and transform information 
into knowledge, one must first truly know the 
circuit being managed. Automating the 
documentation of the physical 
architecture, logical architecture, and 
detailed bill-of-materials is the first step to 
understanding where you’re at, so that you can 
begin to get where you want to go with 
managing distributed computing assets” 
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GRID EVOLUTIONARY NOT REVOLUTIONARY 


When you look at deploying Enterprise Grid capability, we believe Grid is 
evolutionary NOT revolutionary - you need to take steps to implement the technology: 

1. First, you need to connect the assets. This enables you to see what you 
have and where you can first deploy grid technology, and allows for distributed processing. 

2. You can’t manage until you measure, so the next step is to understand the 
drivers - who is using what, at what time of day, etc. Understand the dynamics of how 
your resources are being utilized. This is a measurement exercise. 

3. From that understanding, you are now in a position to manage and 
provision the resources more intelligently. This is done with business policy engines, 
self-healing/HA software and intelligent resource provisioning. 

4. Once we are managing, we are in a better position to tune the 
infrastructure based on user demand - start getting smart about your work in the context 
of business priorities. This is driven by tight integration with the user applications. 

As you build up the Grid framework you will notice that the business silos 
disappear - as we connect and optimize, you are using everything you have in a much 
more effective way - increasing collaboration, utilization and delivering better ROI to 
users and delivering a much better return on the IT asset. 
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GRID FABRIC: DC INFRASTRUCTURE FOR CLUSTERING AND GRID 
(COMES WITH ALL OF OUR PRODUCTS) 


■ Resource Agent : Functions to gather information and operate on any type of 
resources, agent framework for new agent development and plug-in 

■ Communication Backbone : scalable, reliable, efficient and extensible infrastructure 
to collect resource data and execute actions across grid 

■ Distributed Task Execution: facilities to perform user jobs and management tasks 
on any devices and resources across grid 

■ Central Management: creation of a “virtual mainframe” infrastructure 

Performance Management: Measure and analyze system and application 
performance against business requirements: 

■ System and Application Metrics: Key performance indices related to business 

■ Grid Reporting: performance reporting, resource accounting and charge back 

■ Grid Planning: bottleneck identification, capacity planning, workload policies 

■ Management Portal: for transparent, secure access to grid management info 

Service Management: Manage resource supplies to deliver to the most critical work 
activities: 

■ Self-Healing Management: Automation of administrative tasks to keep systems and 
applications in working order and reduce admin costs 

■ Service Provisioning: Dynamic allocation and aggregation of resources for the 
most important work and services according to policies and in response to changes 

■ Failover and HA : Detect service failures & dynamically switch them over to other 
available resources or sites 

■ Service Aggregation: Compose higher level services from other services; services 
supported by multiple instances in a data center or across grid 

Workload Management: Effective processing of various types of user work 
activities: 

■ Distributed Batch: Effective processing of non-interactive jobs across grid 
■ Flow Scheduling: construction, organization, sequencing and staging of related jobs 
into operationalized flows according to dependencies and calendars 

■ Distributed Messaging: Messaging workload processing across grid 

■ Session Load Balancing: Scheduling of interactive session-based applications onto 
servers across grid 
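
Sequencing related jobs according to their dependencies, as in the flow scheduling item above, can be sketched with a topological sort. The job names and the `stages` helper are hypothetical, calendars are not modeled, and this is not Platform's implementation. The sketch assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical flow: each job maps to the set of jobs it depends on.
FLOW = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
    "archive": {"extract"},
}


def stages(flow):
    """Group jobs into stages; jobs within a stage have no unmet dependencies."""
    sorter = TopologicalSorter(flow)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        result.append(ready)
        sorter.done(*ready)
    return result


STAGES = stages(FLOW)
print(STAGES)
```

Jobs that land in the same stage could be dispatched to different grid resources at the same time.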
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Platform Products 
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PLATFORM INTELLIGENCE 


Here is an example of the types of business questions that can be answered with 
Platform Intelligence. 

By geographic location or business unit, recognize how systems and servers are 
deployed and used. 

What are the utilization patterns across the whole infrastructure? Where is there 
head room? Can 20% of the services be reallocated on to existing servers? 

What is the cost of over-utilized licenses? Are they performing productive work or, 
more importantly, the productive work critical to the success of the company? 

Where am I getting the best cost/performance ratio? Is it on large SMPs? 
Desktops? Linux blades? What value would be placed on knowing these details? 


Platform Intelligence 

Typical questions answered 

• What is the resource usage by type across my different lines of business? 

• What is the overall utilization of my infrastructure? 

• Is the infrastructure load balanced across the entire hardware footprint? What 
specific types of servers are not balanced? 

• Who are the top 10 users of my expensive license resources? 

• What machine type has best cost/performance for each specific workload type? 
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HOW IT WORKS 


Platform Intelligence provides unique value by adding in the real-world use of the 
systems and applications. By folding in areas such as locations, business units, projects 
and the people making use of them, it delivers real added value on top of the rich metrics 
concerning applications and systems. 

Transforming, aggregating and building the data warehouse consistently on a 
daily or even hourly basis is a great challenge. With Platform Intelligence, the steps of 
driving this data into the database and generating the resulting reports and OLAP cubes 
have been deeply enhanced and fine-tuned. This allows Platform Intelligence to scale 
beyond 1500 hosts each tracking 20 metrics, as is the case in our QA lab. Scalability is 
our business, so the boundary conditions are huge: databases with tens of millions of 
rows needing to be transformed for presentation. 

The resulting OLAP analytics and reports are updated automatically using built-in 
mechanisms. The data is refreshed constantly, so the data being viewed and interacted 
with is always the most current. 

ROI = use + asset value + business objectives 
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PLATFORM INTELLIGENCE 


The layered architecture builds on the traditional view of business intelligence. At 
the bottom level, data is collected, or extracted, from the correct sources. It is presented to 
the operational data store with the metadata intact, resulting in guaranteed consistency. By 
performing the required aggregations, the data is reduced in size, where possible without 
losing any insight. By performing these operations, the resulting visual tools will be 
smaller, load quicker and perform better. 
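
The aggregation step can be pictured with a small sketch that rolls raw metric samples up to hourly averages per host. The sample data, field layout, and function name are invented for illustration; the actual Platform Intelligence pipeline is not described in enough detail here to reproduce.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw samples: (host, "YYYY-MM-DD HH:MM" timestamp, CPU utilization in percent)
SAMPLES = [
    ("host1", "2003-03-18 09:00", 20.0),
    ("host1", "2003-03-18 09:30", 40.0),
    ("host1", "2003-03-18 10:00", 60.0),
    ("host2", "2003-03-18 09:15", 10.0),
    ("host2", "2003-03-18 09:45", 30.0),
]


def aggregate_hourly(samples):
    """Reduce raw samples to one average value per (host, hour)."""
    buckets = defaultdict(list)
    for host, timestamp, value in samples:
        hour = timestamp[:13]  # keep "YYYY-MM-DD HH"
        buckets[(host, hour)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


HOURLY = aggregate_hourly(SAMPLES)
print(HOURLY)
```

Five raw rows become three aggregated rows here; at the scale of many hosts and metrics, the same reduction is what keeps the reporting layer small.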

From the operational data store (ODS) the OLAP cubes and reports are built. 
These multi-dimensional viewers provide a means to interact with the focused subject 
areas that are of most interest. Be it projects, licenses, workload or IT performance, they 
are pre-organized to speed access to the greatest value. By linking the cubes it is also 
possible to identify an area of focus and follow it through to another OLAP 
multidimensional view. 

Users interact with Platform Intelligence through a management portal that is 
personalized for the individual user. Be it operational administrator or executive, each can 
have a view tailored to their unique informational needs, which the user also controls and 
can continue to modify. This portal is based on Internet technology and supports secure 
access in a zero-footprint fashion from any access point across the enterprise. 




Platform Intelligence 


IT Management Portal 
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Data assets through Grid Fabric 



Chart legend, Day (7:00 - 15:59): BackOffice, DailyRisk, EndofDay, OIC, Other 


Platform SiteAssure 

• Customizable policy-based automated control 
system 

• Passive and active agents driven by rules 
engine 

• Support dynamic service provisioning 

• End-to-end resource management solution 
optimizing availability & performance of 
services 



Figure 19 
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Platform LSF 

“I want to make sure our users’ work is processed 
reliably, timely, and easily - I want them to focus on 
their work, not IT” 

• Work Load Management optimizes the productivity of the 
computing environment whereas Service Management 
manages computing supply 

• Our value is distributed scheduling, reliable management, 
dynamic service allocation, heterogeneous resource 
matching, and application integration 

• Enterprise workload requires more than a load balancer 
switch 


Figure 22 
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OPEN, SCALABLE WORKLOAD MANAGER 


Scalable and Grid-enabled, the workload manager is extensible and customizable 
to meet the growing needs of users. The mbatchd is split into two processes: a manager 
process and a scheduler process. The scheduler is further modularized into a number of 
plug-ins; the scheduler process loads a number of scheduler plug-ins corresponding to 
specific policies. Note that MultiCluster is handled as just another scheduler plug-in. 
The manager and scheduler communicate over a socket. The separation of the mbatchd 
into two processes enhances performance, as it allows the scheduler to focus on 
scheduling while the manager does the overall coordination, including handling client, 
LIM, and sbatchd interactions. 

Additionally, the scheduling data structure and algorithm have been restructured 
so that the scheduler is resource-centric; it maps resources to available jobs rather than 
vice versa, as was the case in 4.x. This means that as long as a resource is available, the 
scheduler will schedule the next waiting job. This significantly improves performance 
when there are a large number of jobs in the system, as the scheduler does not need to go 
through a long list of jobs. 

The scheduler plug-in API is unique in that multiple plug-ins can co-exist at the 
same time and can complement each other’s policies. This is in contrast to SGE/PBS, 
where the scheduler must be entirely replaced. A site can write additional scheduling 
policies and simply plug them into LSF to complement the existing LSF policies. 

An LSF Web Service Broker is introduced to support a SOAP/XML interface into 
LSF. This is consistent with our standards-based direction. The SOAP/XML interface 
means users can access LSF functionality (e.g., submit a job) programmatically in a 
platform-independent manner. The Web GUI is built using the LSF Web Service Broker. 
The GUI uses the .Net infrastructure. 
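
The combination of a resource-centric pass and coexisting policy plug-ins can be sketched as follows. This is a conceptual toy only: the class names, the `allows` method, and the dictionary fields are assumptions made for the example and are not the LSF scheduler plug-in API.

```python
class MemoryPolicy:
    """Plug-in: a job fits only if the resource has enough memory."""

    def allows(self, resource, job):
        return resource["mem_gb"] >= job["mem_gb"]


class QueuePolicy:
    """Plug-in: a resource only serves the queues it is configured for."""

    def allows(self, resource, job):
        return job["queue"] in resource["queues"]


class Scheduler:
    def __init__(self, plugins):
        self.plugins = list(plugins)  # all plug-ins are consulted together

    def dispatch(self, free_resources, waiting_jobs):
        """For each free resource, take the first waiting job that every plug-in allows."""
        placements = {}
        pending = list(waiting_jobs)
        for resource in free_resources:  # outer loop is over resources, not jobs
            for job in pending:
                if all(p.allows(resource, job) for p in self.plugins):
                    placements[resource["name"]] = job["name"]
                    pending.remove(job)
                    break
        return placements, [job["name"] for job in pending]


resources = [
    {"name": "r1", "mem_gb": 4, "queues": {"batch"}},
    {"name": "r2", "mem_gb": 16, "queues": {"batch", "interactive"}},
]
jobs = [
    {"name": "j1", "mem_gb": 8, "queue": "batch"},
    {"name": "j2", "mem_gb": 2, "queue": "batch"},
    {"name": "j3", "mem_gb": 1, "queue": "interactive"},
]
PLACED, LEFT = Scheduler([MemoryPolicy(), QueuePolicy()]).dispatch(resources, jobs)
print(PLACED, LEFT)
```

Adding a new site policy in this model means appending another object with an `allows` method, leaving the existing policies in place.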



Open, modular plug-in schedulers scale 
with the growth of your business 
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MULTICLUSTERS 


With this figure we want to highlight our complete end-to-end capability, from 
Grid computing to clusters to subclusters of desktops. NO ONE else has this complete 
solution for Life Sciences customers. With the growing computational requirements it is 
necessary to leverage all resources across an organization. 



MultiCluster 
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SubCluster 
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ABOUT OUR PRODUCT SUITE 


In many cases we see companies consider the ‘workload management’ component 
of grid technology. While clearly the ‘bread and butter’ of grid computing, we consider 
three dimensions to a successful grid implementation: The tools to build, run, and 
manage a grid environment. 

The development environment considers the tools required to build grid 
applications. In our product suite we have IDEs as well as GUI-based grid workflow 
processing design tools. 




Platform’s product suite provides the tools to develop, execute, and 
operate the Grid Office and the Compute Backbone/Grid 


Industrial Manufacturing 



Figure 26 




Case Studies 

• TACC 

• ASCI 

• TRW 

• SHARCNET 

• ENEA 

• Pharmacia 

• Hokkaido University 


Figure 27 


Texas Advanced Computing Center (TACC) 


Leader in inter-Grid collaboration 
TACC is building a University-wide Grid to connect 
clusters, workstations, visualization systems, and 
storage devices 

Accelerate collaborative computational science and 
engineering on Grids 

Enable researchers to share HPC resources & 
execute codes across multiple HPC systems - Grid 
of Grids 

Platform LSF 
Platform MultiCluster 
Platform Globus 

Web-based Grid portal simplifying use of many 
HPC systems 

Interoperability between local, state, national and 
global Grids 


"By partnering with industry leaders like Platform, we hope to accelerate the 
collaborative nature of science on grids. Because TACC can leverage Platform's work 
with Platform Globus, Platform LSF, and Platform MultiCluster at the outset, we can 
focus our attention on the cutting edge of enabling science. 

Together, we can leverage existing research initiatives and develop new solutions 
rapidly. Much of the work we do today in universities is what industry will demand 
tomorrow." 

Jay Boisseau, 

Director, TACC 


Seismic LSF Leasing Job Submit 




This method leases processors from Padre if the job needs additional processors. 
Brazos has 2 processors and up to 4 can be borrowed from Padre. 
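
The leasing rule stated above amounts to simple arithmetic, sketched below. The function name and error handling are hypothetical; only the limits (2 local processors on Brazos, up to 4 borrowed from Padre) come from the text, and the real LSF leasing mechanism is not shown.

```python
def plan_lease(requested, local=2, max_borrow=4):
    """Return (local_used, borrowed) for a job requesting `requested` processors.

    Defaults follow the text: Brazos has 2 processors and up to 4 can be
    borrowed from Padre.
    """
    if requested < 1:
        raise ValueError("a job needs at least one processor")
    local_used = min(requested, local)
    borrowed = requested - local_used
    if borrowed > max_borrow:
        raise ValueError(f"cannot lease {borrowed} processors; limit is {max_borrow}")
    return local_used, borrowed


for n in (2, 5, 6):
    print(n, plan_lease(n))
```

A request for more than 6 processors is rejected in this sketch, since it would exceed the local processors plus the borrowing limit.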


[Screenshot: "LSF Leasing Job Submission" web form. Fields: Job Name (Seismic_17-11-2002,19:50); Parameters (fx, dx, fz, dz, xs, zs, tmax, mt, fpeak, fmax, verbose, hsz); Machine (Brazos, UT Austin); Number of Processors; Directory for Results; Output File. Note on the form: "This may take a few minutes to run." Button: Submit LSF Job.] 
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ERA RIO 


P" High Performance Computing Across Texas 




THE UNIVERSITY OF TEXAS AT AUSTIN 


Texas Advanced Computing Center 
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Platform 


ASCI (Accelerated Strategic Computing 
Initiative Grid) - Sandia National Laboratory 


ASCI securely connects three geographically 
dispersed U.S. Department of Energy Labs 
• Sandia National Labs (Albuquerque, NM) 

• Lawrence Livermore National Lab (Livermore, 
CA) 

• Los Alamos National Lab (Los Alamos, NM) 

Required Kerberos-compliant Grid solution 
Platform Globus with Kerberos security 
Kerberos-compliant Partner Grid 


"Platform Globus offers us 
numerous benefits, such as 
multi-platform commercial 
technical support, quality 
assurance and cost 
effectiveness. We made 
initial steps in hardening 
the Globus Toolkit for use 
with Kerberos, and are now 
collaborating with Platform 
to enhance the initial 
integration." 

Steven Humphreys, 

ASCI Grid Services Project 
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TRW 


Leader in advanced technology for aerospace, 
information systems and automotive markets 
worldwide 


Increase license availability across multiple 
geographic locations & determine actual use per user 


Platform Global License Broker 
Platform Intelligence 


Transparently share software licenses across multiple 
geographic locations without manual intervention 
Expect to reduce additional license purchases 



"With this solution we 
can 
pre-empt idle licenses 
from interactive 
sessions, and track 
usage to determine 
actual use per user, all 
without human 
intervention." 


Al Danial, 

Manager, Engineering 
Applications, 
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Platform 





[Figure: SHARCNET map of southwestern Ontario showing member institutions (University of Guelph, Waterloo, Wilfrid Laurier University, Windsor) linked by ultra high speed fiber optic networking] 
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PARALLEL COMPUTING 
& IMMERSIVE GRAPHICS AT ENEA 


Figure 35 
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(San Francisco) 
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HOKKAIDO UNIVERSITY COMPUTING CENTER IN SAPPORO, JAPAN 


The budding grid environment is based on the "e-science" concept of 
collaborating on and sharing research electronically, as recommended by Japan's Ministry 
of Education, Culture, Sports, Science and Technology. 


Hokkaido University, Japan 


National public university 

• First of seven e-science centers 

Scientific disciplines 

• Environmental studies 

• Nanotechnology 

• Observational astronomy 

• Post-genomic bioinformatics 

Grid-enabled visualization 

• SGI Visual Area Networking 

• Platform LSF 

• Platform Globus 

32-CPU SGI® Onyx® 300 visualization system 
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Platform 


Grid-Enabled Visualization 

• Scientists collaborate in real time 

• Geography is not binding 

- Scientists don't need to relocate to the site of compute, data or visualization services 

• Required bandwidth is minimal 

- Thin clients move pixels, not data 

• Security is simplified 

- Authentication and authorization 

- e-level encryption (e.g. via SSL) 
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Platform 



Figure 40 




There are four key dimensions to consider when applying 
an enterprise grid solution 


Business 
  Focus: Development of new strategy that changes the way people work and business gets done 
  Benefits: 
  • Focus on strategic task 
  • New ways to do business 

Process 
  Focus: Modification of strategic business processes to leverage application optimization 
  Benefits: 
  • Process efficiency 
  • Reduce COGS 

Applications 
  Focus: Customization, configuration and optimization of ‘Grid enabled’ applications 
  Benefits: 
  • Maximize software utilization 
  • Reduce licensing costs 

Infrastructure 
  Focus: Transformation of core infrastructure into a virtualized computing environment 
  Benefits: 
  • Maximize hardware utilization 
  • Software cost reduction 
  • Streamline IT Maintenance 

Relationship labels on the slide: Enables, Optimizes, Supports 
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IPG POWER GRID OVERVIEW 


Thomas Hinke 

NASA Advanced Supercomputing (NAS) Division 
NASA Ames Research Center 
Moffett Field, CA 
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IPG POWER GRID OVERVIEW AND ACKNOWLEDGMENT 


This presentation will provide a brief overview of the Information Power Grid. 

I would like to acknowledge that many of the slides used in this presentation are 
based on a set of slides prepared by Tony Lisotta, for a grid tutorial that he recently 
presented at Global Grid Forum 7 in Tokyo. 


Information Power Grid (IPG) Overview 



Emerging and Future 
Computing Paradigms 
Workshop 

Thomas H. Hinke 
March 19, 2003 
Langley Research Center 
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OUTLINE 


This presentation will describe what is meant by grids and then cover the current 
state of the IPG. This will include an overview of the middleware that is key to the 
operation of the grid. The presentation will then describe some of the future directions 
that are planned for the IPG. Finally the presentation will conclude with a brief overview 
of the Global Grid Forum, which is a key activity that will contribute to the successful 
availability of grid components. 


Outline 


• What are Grids? 

• Current State of Information Power Grid (IPG) 

• Overview of IPG Middleware 

• Future Directions 

• Global Grid Forum 
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WHAT DO GRIDS DO? 


Grid software is middleware that sits on top of the network and the connected 
resources such as computers, storage and instruments. It provides an infrastructure 
on which to build large, distributed collaborative environments, along with security 
and the means to integrate distributed resources easily and in a cost-effective 
manner. 


What Do Grids Do? 


• Grids provide the infrastructure 

- To dynamically integrate independently managed: 

• Compute resources 

• Data sources 

• Scientific Instruments (Wind Tunnels, Microscopes, Simulators, etc.) 

- To build large scale collaborative problem solving environments that are: 

• Cost effective 

• Secure 

• Grid software is "middleware" 


This is a Grid Enabled Infrastructure 





Resources 
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WHY USE GRIDS? 


The goal of grids is to provide software that makes it easy for users to use 
distributed resources, such as distributed computers, storage or even instruments. The 
grid is actually a set of tools that permits these distributed resources to be easily accessed, 
as if they were on the local system. These tools can also be used to develop distributed 
applications. They help the distributed application developer to focus on his application, 
with the grid providing the software to handle the distributed access. 


Why Grids? 

For NASA and the general community today Grid 
middleware: 

- Provides tools to access/use data sources (databases, 
instruments, ...) 

- Provides tools to access computing (unique and generic) 

- Is an enabler of large scale collaboration 

• Dynamically responding to needs is a key selling point of a grid. 

• Independent resources can be joined as appropriate to solve a 
problem. 

- Provides tools for development of application-oriented 
frameworks 

- Provides value added service to the NASA user base for utilizing 
resources on the grid in new and more efficient ways 
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WHAT CHARACTERISTICS ARE NORMALLY FOUND IN A GRID 


Security is a fundamental aspect of a grid, with most grids basing their security on 
public key technology, which is used to protect at least the authentication 
information as it flows between the various sites on the grid. The IPG uses the Grid 
Security Infrastructure (GSI), based on the Globus toolkit, for its security. 

Using GSI, grids can support single sign-on, which means that after a user signs on 
to one grid resource for a session, he is able to use other grid resources on which he 
has an account without any further identification or authentication. 

Grids also provide a grid information service (GIS), that provides a single 
mechanism by which users can discover grid resources and associated information 
about the resource. 

Grids are designed to be scalable to a large number of resources. 

Finally, grids are designed to provide access to resources that may be under the 
control of different administrative groups. They are not designed to have 
centralized control. 


Normal Grid Characteristics 


• An underlying security infrastructure such as the Grid 
Security Infrastructure (GSI), which is based on public 
key technology 

- Protection for at least authentication information as it flows from 
resource to resource 

• Readily accessible information about the resources on 
the Grid via a single mechanism, the Grid Information 
Service (GIS) 

• Single sign-on 

• A seamless processing environment 

• An infrastructure that is scalable to a large number of 
resources 


• The ability for the grid to cross administrative 
boundaries 
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DISTRIBUTED SYSTEMS BEFORE THE GRID 


Before the development of the grid, people still developed distributed systems. 
Under these pre-grid distributed systems, a user was responsible for dealing with all of 
the complexities of the distributed environment. 


Before the Grid 



The User is responsible for 
resolving the complexities of 
the environment 



Ames Research Center 


Independent sites 
Independent 
hardware and 
software 

Independent user ids 
Security policy 
requiring individual 
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machine. 




Figure 6 




DISTRIBUTED SYSTEMS USING TODAY’S GRID 


The grid provides the middleware that ties distributed resources into a seamless 
environment. Using the grid, a user can make a request to the Grid Information Service 
for information about the location and characteristics of grid resources such as processing 
and storage resources or instruments. With this information, the user can then launch an 
application that accesses the desired distributed resources through the grid middleware. 
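
The three steps described above (ask the information service, read the response, select a resource and submit) can be sketched in a few lines. This is only an in-memory illustration: `ResourceRecord`, `InformationService` and `submit_job` are hypothetical names invented here, not part of Globus or the IPG, and the resource entries are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    """One entry as an information service might report it (hypothetical schema)."""
    name: str
    site: str
    kind: str          # "compute", "storage" or "instrument"
    free_cpus: int = 0


class InformationService:
    """Toy stand-in for a grid information service: a searchable list of records."""

    def __init__(self, records):
        self._records = list(records)

    def query(self, kind, min_free_cpus=0):
        # Steps 1 and 2: the user asks about resources and gets matching records back.
        return [r for r in self._records
                if r.kind == kind and r.free_cpus >= min_free_cpus]


def submit_job(resource, command):
    """Step 3 placeholder: real middleware would launch the job remotely here."""
    return f"submitted '{command}' to {resource.name} at {resource.site}"


info = InformationService([
    ResourceRecord("origin-a", "Ames", "compute", free_cpus=64),
    ResourceRecord("cluster-b", "Glenn", "compute", free_cpus=8),
    ResourceRecord("archive-c", "Langley", "storage"),
])
candidates = info.query("compute", min_free_cpus=16)
chosen = max(candidates, key=lambda r: r.free_cpus)  # the user makes the selection
receipt = submit_job(chosen, "./simulate")
```

Note that the selection step is still done by the user's own code; nothing in this layer chooses a resource automatically.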


(1) Request info 
from the grid 

(2) Get response 

(3) Make selection 
and submit job 


The Grid Today 


User 


Application 

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Common Middleware 

- Abstracts independent 
hardware, software, 
user ids into a 
service layer with 
defined APIs 

- Provides 
comprehensive 
security 

- Allows for site 
autonomy 

- Provides a common 
infrastructure based 
on middleware 
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DISTRIBUTED SYSTEMS USING TODAY’S GRID 


The key to the grid is that the underlying grid resources are abstracted into 
application programmer interfaces that simplify the development of distributed 
applications. While this is a significant step forward, this layer does not have much 
intelligence; adding that intelligence will define the next stage of grid development. 


(1) Request info 
from the grid 

(2) Get response 

(3) Make selection 
and submit job 


The Grid Today 


User 


Application 


The underlying infrastructure is abstracted into 
defined APIs, thereby simplifying developer and 
user access to resources; however, this layer is 
not intelligent. 



Ames Research Center 


Common Middleware 

- Abstracts independent 
hardware, software, 
user ids into a 
service layer with 
defined APIs 

- Provides 
comprehensive 
security 

- Allows for site 
autonomy 

- Provides a common 
infrastructure based 
on middleware 
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THE NEAR FUTURE GRID WILL HAVE INTELLIGENCE 


The grid for the near future will have intelligent, customizable middleware that 
will sit between the current grid middleware and the application. This intelligent layer 
will perform brokering (the automatic selection of resources) and will provide 
information tailored to the specific needs of the user or application. 

Under the current grid, a user must have an account on each resource that is used, 
thus preserving local autonomy. Under the near future grid, if a local system agrees, the 
grid will take responsibility for granting grid users access to these resources, even where 
the user has not pre-established an account. 
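
The dynamic account idea can be illustrated with a small sketch. It assumes, purely for illustration, that a consenting site lends the grid a pool of unassigned local accounts; the `Site` class, `resolve_account` function and all account names are hypothetical and do not describe how the IPG implements this.

```python
class Site:
    """A resource site with its own account policy (hypothetical model)."""

    def __init__(self, name, accounts, allows_dynamic_accounts=False, pool=()):
        self.name = name
        self.accounts = dict(accounts)   # grid identity -> pre-established local account
        self.allows_dynamic_accounts = allows_dynamic_accounts
        self.pool = list(pool)           # local accounts the site agrees to lend out


def resolve_account(site, grid_user):
    """Return the local account a grid user may run under at this site."""
    if grid_user in site.accounts:
        # Current grid: the user already holds an account at the site.
        return site.accounts[grid_user]
    if site.allows_dynamic_accounts and site.pool:
        # Near future grid: the site has agreed, so the grid assigns an account.
        leased = site.pool.pop(0)
        site.accounts[grid_user] = leased
        return leased
    raise PermissionError(f"{grid_user} has no account at {site.name}")


strict_site = Site("site-a", {"alice": "asmith"})
open_site = Site("site-b", {}, allows_dynamic_accounts=True, pool=["grid001", "grid002"])

first = resolve_account(open_site, "alice")    # leased from the pool
second = resolve_account(open_site, "alice")   # same lease is reused
```

Local autonomy is preserved because the lease happens only when the site has set `allows_dynamic_accounts`; a site that has not agreed still rejects unknown users.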

Another key capability that will soon be available is the ability to field 
grid-enabled web services that provide a standard API that can be accessed from applications, 
application-specific portals or command-line functions. 


The Near Future Grid 



Customizable Grid 
Services built on 
defined Infrastructure 
APIs 

• Automatic selection 
of resources 

• Information products 
tailored to users 

• Dynamic account 
access 

• Flexible interface: 
grid-enabled web 
services based, 
application-specific 
portals, command 
line, APIs, 
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THE NEAR FUTURE GRID WILL HAVE INTELLIGENCE 


With this more intelligent grid, the users and application developers will be able 
to focus more on the science and engineering applications and not on the distributed 
systems management aspects of their systems. 


The Near Future Grid 



(service oriented) 


Infrastructure 




Customizable Grid 
Services built on 
defined Infrastructure 
APIs 

• Automatic selection 
of resources 

• Information products 
tailored to users 

• Dynamic account 
access 

• Flexible interface: 
grid-enabled web 
services based, 
application-specific 
portals, command 
line, APIs, 
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HOW THE USER AND APPLICATION DEVELOPERS SEE A GRID 


A grid is really just a set of tools that can be accessed through application 
programmer interfaces or command line functions. These tools will be augmented with 
services that will be structured as grid-enabled web services, which are re-usable such 
that one or more of these can be combined to make more complex services. 

Once a user has authenticated to the grid, he can use any of the various services 
that are shown on the slide as if these were part of his local machine. He does not have to 
re-authenticate to use any of these, with the grid handling the requirement to pass 
identification and authentication information among the resources that are used. 


How the User and Application 

Developers See a Grid 

• A set of grid functions that are available as 

- Application programmer interfaces (APIs) 

- Command-line functions 

- Grid-enabled web services 

• After authentication, grid functions can be used to 

- Spawn jobs on different processors with a single command 

- Access data on remote systems 

- Move data from one processor to another 

- Support the communication between programs executing on different 
processors 

- Discover the properties of computational resources available on the grid 
using the grid information service 

- Use a broker to select the best place for a job to run and then negotiate 
the reservation and execution (coming soon). 
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OUTLINE 


In the next section we will look at the current state of the IPG. 


Outline 


• What are Grids? 

• Current State of Information Power Grid (IPG) 

• Overview of IPG Middleware 

• Future Directions 

• Global Grid Forum 
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IPG LOCATIONS 


The IPG currently has resources located at the five NASA Centers shown on the 
map. 



IPG Locations 


• Ames Research Center 

• Glenn Research Center 

• Goddard Space Flight Center 

• Jet Propulsion Laboratory 

• Langley Research Center 
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IPG RESOURCES 


The IPG currently has the computational resources shown. 


IPG Resources 

• Server Nodes 

- 1024 CPU, single system 
image SGI, Ames 

- 512 CPU SGI O2K, Ames 

- 128 CPU Linux Cluster, 
Glenn 

- 124 CPU SGI O2K, Ames 

- 64 CPU SGI O2K, Ames 

- 24 CPU SGI O2K, Glenn 

- 16 CPU SGI O2K, Langley 

- 16 CPU SGI O2K, Ames 

- 8 CPU SGI O3K, Langley 

- 4 CPU SGI O2K, Langley 

• Client Nodes 

- 16 CPU SGI O300, JPL 

- 8 CPU SGI O300, Goddard 


• Wide area network interconnects of at least 100 Mbit/s 

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OUTLINE 


The next section will delve more deeply into the nature of the IPG middleware. 


Outline 


• What are Grids? 

• Current State of Information Power Grid (IPG) 

• Overview of IPG Middleware 

• Future Directions 

• Global Grid Forum 
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IPG IS BUILD ON GLOBUS TOOLKIT 2 


The IPG, like most of the grids in the world, is built on Globus Toolkit 2 (GT2). 
The Grid Security Infrastructure (GSI) is based on X.509 certificates, Secure Sockets Layer 
(SSL) and Transport Layer Security (TLS). This supports a GSI-enabled Secure Shell 
(SSH) and GridFTP (a high performance GSI version of FTP). 

The Grid Information Service is based on LDAP (Lightweight Directory Access 
Protocol), which supports the Monitoring and Discovery Service (MDS), which provides 
a directory of grid resources and attributes. 

Finally, the remote execution of jobs is supported by the Globus Resource 
Allocation Manager (GRAM), which provides an interface to various batch schedulers 
(e.g., PBS and LSF), as well as to systems that permit users to directly execute jobs via 
fork. 
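
Jobs handed to GRAM in GT2 are described in a Resource Specification Language (RSL) string. As a hedged sketch of what such a description looks like, the helper below assembles a string in the general `&(attribute=value)` form. The helper names `rsl_value` and `to_rsl` are invented here, and the quoting rule (double quotes, with embedded quotes doubled) is an assumption that should be checked against the Globus documentation before use.

```python
def rsl_value(value):
    """Render one value: integers bare, everything else double-quoted (assumed rule)."""
    if isinstance(value, int):
        return str(value)
    return '"' + str(value).replace('"', '""') + '"'


def to_rsl(attributes):
    """Build a conjunction of (attribute=value) relations from a dict."""
    parts = []
    for name, value in attributes.items():
        if isinstance(value, (list, tuple)):
            # Multi-valued attributes are written as space-separated values.
            rendered = " ".join(rsl_value(v) for v in value)
        else:
            rendered = rsl_value(value)
        parts.append(f"({name}={rendered})")
    return "&" + "".join(parts)


rsl = to_rsl({
    "executable": "/bin/hostname",
    "arguments": ["-f"],
    "count": 4,
})
```

The same description could be passed to a batch scheduler or to a fork-based job manager, which is the point of having GRAM as a common interface.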


IPG Uses Globus GT2 Software 


•login 

•execute commands 


•get and put files 
•3rd party copy 
•interactive file management 
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IPG/GLOBUS DEPLOYMENT ARCHITECTURE 


To support the grid information service of a deployed grid, a Grid Resource 
Information Service (GRIS) captures local information from each resource and forwards 
this to a Grid Index Information Service (GIIS), that provides a single source for 
information about a particular grid. 

Users, applications or web portals can use Globus client services to access any of 
the grid tools and services. 
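
The relationship between the per-resource GRIS and the grid-wide GIIS can be sketched as follows. The real services are LDAP-based; this illustration replaces LDAP with plain Python dictionaries, and the class methods, host names and attribute names are all assumptions made for the example.

```python
class GRIS:
    """Per-resource reporter: gathers local facts about one host (toy version)."""

    def __init__(self, host, probe):
        self.host = host
        self._probe = probe   # callable returning a dict of local attributes

    def report(self):
        return {"host": self.host, **self._probe()}


class GIIS:
    """Grid-wide index: collects reports so users have one place to search."""

    def __init__(self):
        self._sources = []
        self._index = {}

    def register(self, gris):
        self._sources.append(gris)

    def refresh(self):
        # Pull a fresh report from every registered resource.
        self._index = {g.host: g.report() for g in self._sources}

    def search(self, **required):
        # Return hosts whose reported attributes match every requested value.
        return sorted(host for host, info in self._index.items()
                      if all(info.get(k) == v for k, v in required.items()))


giis = GIIS()
giis.register(GRIS("host-a", lambda: {"os": "irix", "cpus": 512}))
giis.register(GRIS("host-b", lambda: {"os": "linux", "cpus": 128}))
giis.refresh()
irix_hosts = giis.search(os="irix")
```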


IPG/Globus Deployment Architecture 




Figure 17 
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ADDITIONAL SERVICE UNDER DEVELOPMENT BY THE IPG PROJECT 


To provide the added intelligence needed to facilitate the development of grid 
applications and the use of the grid by users, the IPG project is developing a Job Manager 
to manage the reliable execution of a job on the grid. The Job Manager will stage the 
necessary files needed by the application, monitor the progression of the work and then 
post-stage the results, cleaning up any files that may remain from the execution. 

The Job Manager is supported by the Resource Broker that provides the user with 
suggestions about where to run his application, based on supplied information about the 
application. 


Additional IPG Services 


• Job Manager 

- Reliably execute a job 

• Set of files to pre-stage 

• Executable to run 

- Including directory, environment variables 

• Set of files to post-stage 

• Resource Broker 

- Provide suggestions on where to run a job 

- Input 

• Which hosts and operating systems are acceptable 

• How to create a Job Manager Job for a selected host 

- Selection made using host and OS constraints and host load 

• Interactive system: # free CPUs 

• Batch system: Amount of work in queue / # CPUs 

- Output 

• Ordered list of Job Manager Jobs (suggested systems) 
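
The selection rule on the slide can be sketched as a small ranking function. The slide gives the two load measures (free CPUs for interactive systems, queued work per CPU for batch systems) but does not say how they are compared with each other, so this sketch makes an explicit assumption: interactive systems with enough free CPUs are listed first, followed by batch systems in order of increasing queue load. The `Host` fields and host names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    os: str
    cpus: int
    interactive: bool
    free_cpus: int = 0             # load measure for interactive systems
    queued_cpu_hours: float = 0.0  # amount of work in queue, for batch systems


def batch_load(host):
    """Batch load as on the slide: amount of work in queue / number of CPUs."""
    return host.queued_cpu_hours / host.cpus


def suggest_hosts(hosts, acceptable_hosts, acceptable_os, cpus_needed):
    """Return host names ordered from most to least attractive."""
    eligible = [h for h in hosts
                if h.name in acceptable_hosts
                and h.os in acceptable_os
                and h.cpus >= cpus_needed]
    interactive = sorted(
        (h for h in eligible if h.interactive and h.free_cpus >= cpus_needed),
        key=lambda h: -h.free_cpus)
    batch = sorted((h for h in eligible if not h.interactive), key=batch_load)
    # Assumption: ready interactive systems outrank any batch queue.
    return [h.name for h in interactive + batch]


hosts = [
    Host("big-batch", "irix", 512, interactive=False, queued_cpu_hours=2048.0),
    Host("linux-batch", "linux", 128, interactive=False, queued_cpu_hours=128.0),
    Host("small-a", "irix", 16, interactive=True, free_cpus=12),
    Host("small-b", "irix", 16, interactive=True, free_cpus=2),
]
suggestions = suggest_hosts(
    hosts,
    acceptable_hosts={"big-batch", "linux-batch", "small-a", "small-b"},
    acceptable_os={"irix"},
    cpus_needed=8,
)
```

In the design described above, each suggestion would come with a ready-made Job Manager Job for that host; here only the ordering step is shown.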
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ROLE OF ADDITIONAL IPG SERVICES 


Applications will be able to consult the broker for suggestions as to the best grid 
resources to use, given the current workload on each of these resources. This information 
will then be used to run the application on the suggested resources, using the Job Manager 
to stage necessary files, monitor the progress of the work and then post-stage any files 
at the end of the work. 


Role of Additional IPG Services 



•System requirements 
•How to use systems 
Output: 

•Suggested Job Manager Jobs 


Job Manager Job: 

•Files to pre-stage 
•Application to execute 
•Files to post-stage 
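
The life cycle of a Job Manager Job (pre-stage files, execute, post-stage results, clean up) can be sketched on a single machine. This is not the IPG Job Manager: it copies files between local directories instead of moving them across the grid, it waits for the job rather than monitoring it, and `run_managed_job` is a name invented for this illustration.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_managed_job(pre_stage, command, post_stage, results_dir):
    """Stage inputs into a scratch directory, run the command, copy outputs back."""
    work = Path(tempfile.mkdtemp(prefix="job-"))
    try:
        for src in pre_stage:                      # pre-stage input files
            shutil.copy(src, work / Path(src).name)
        completed = subprocess.run(command, cwd=work,
                                   capture_output=True, text=True)
        results = Path(results_dir)
        results.mkdir(parents=True, exist_ok=True)
        staged = []
        for name in post_stage:                    # post-stage the results
            produced = work / name
            if produced.exists():
                shutil.copy(produced, results / name)
                staged.append(name)
        return completed.returncode, staged
    finally:
        shutil.rmtree(work, ignore_errors=True)    # clean up leftover files


scratch = Path(tempfile.mkdtemp(prefix="demo-"))
(scratch / "input.txt").write_text("42\n")
script = ("from pathlib import Path; "
          "Path('output.txt').write_text(Path('input.txt').read_text().strip() + ' processed')")
code, staged = run_managed_job(
    pre_stage=[scratch / "input.txt"],
    command=[sys.executable, "-c", script],
    post_stage=["output.txt"],
    results_dir=scratch / "results",
)
```

The `finally` clause reflects the requirement that cleanup happens even when the job fails.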
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OUTLINE 


Next we will briefly look at future directions. 


Outline 


• What are Grids? 

• Current State of Information Power Grid (IPG) 

• Overview of IPG Middleware 

• Future Directions 

• Global Grid Forum 





Figure 20 




OPEN GRID SERVICES ARCHITECTURE (OGSA) 


The Open Grid Services Architecture is the grid community’s adoption of the 
web services work (which other than the name has little to do with the web) as a way of 
delivering services. Grid-enabled web services provide a standard Web Services 
Description Language (WSDL) description of the service and a specified protocol, which 
for now is SOAP, for accessing these services. Grid-enabled web services provide a 
self-describing way to offer services that can be included as components of other 
grid-enabled web services. 

Standards are under development by the Global Grid Forum to specify the 
interfaces and the nature of the service-management capabilities (creation, destruction, 
lifetime) that are to be associated with each service. 

One of the key contributions that grid-enabled web services offer over web 
services is that they will be built to use grid security, such as the Grid Security 
Infrastructure. 
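
To make the role of SOAP concrete, the sketch below builds the kind of XML envelope that carries a service invocation. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the `submitJob` operation and its parameters are hypothetical and are not taken from OGSA or any Globus service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:grid-job-service"              # hypothetical service namespace


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request("submitJob", {"executable": "/bin/hostname", "count": 4})
```

In practice the operation names and parameter types would come from the service's WSDL description rather than being written by hand.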


Open Grid Services Architecture (OGSA) 

• New framework for creating grid services 

• Based on web services 

- Standards to build and use distributed services 

• Service description language: WSDL (Web Services Description Language) 

• Service invocation: SOAP (Simple Object Access Protocol) 

• OGSA extends web services with: 

- Requirements for service interfaces such as providing service data and 
notifications 

- Service management (creation, destruction, lifetimes) 

- Security 

• Standards being developed in the Global Grid Forum 
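
The service management items listed above (creation, destruction, lifetimes) can be illustrated with a toy factory in which every service instance has an expiry time that clients may extend. The `ServiceFactory` class and its method names are invented for this sketch and are not the interfaces being standardized; time is passed in explicitly so that the behaviour is easy to follow.

```python
class ServiceFactory:
    """Creates service instances that expire unless their lifetime is extended."""

    def __init__(self):
        self._expiry = {}     # handle -> time at which the instance expires
        self._next_id = 0

    def create(self, now, lifetime):
        self._next_id += 1
        handle = f"service-{self._next_id}"
        self._expiry[handle] = now + lifetime
        return handle

    def extend(self, handle, now, lifetime):
        if handle not in self._expiry:
            raise KeyError(handle)
        self._expiry[handle] = now + lifetime

    def destroy(self, handle):
        # Explicit destruction by the client.
        self._expiry.pop(handle, None)

    def sweep(self, now):
        # Remove every instance whose lifetime has run out.
        expired = [h for h, t in self._expiry.items() if t <= now]
        for handle in expired:
            del self._expiry[handle]
        return expired

    def is_alive(self, handle):
        return handle in self._expiry


factory = ServiceFactory()
kept = factory.create(now=0, lifetime=60)
forgotten = factory.create(now=0, lifetime=60)
factory.extend(kept, now=50, lifetime=60)   # client is still interested
expired = factory.sweep(now=70)             # the other instance times out
```

Bounded lifetimes mean that instances abandoned by a failed client are eventually reclaimed without an explicit destroy call.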





Figure 21 




GLOBUS TOOLKIT VERSION 3 (GT3) 


A key first application of OGSA will be the next version of the Globus Toolkit, 
which is called Globus Toolkit Version 3 (GT3). The various grid services offered by the 
Globus Toolkit will be offered as grid-enabled web services. 

GT3 and OGSA will revolutionize how services are offered on the grid, since they 
will make it easy to include existing services in more complex, application-specific 
services. 

The IPG will transition to GT3 as soon as it is stable and in a way that minimizes 
any impact to existing users. 


Globus Toolkit Version 3 (GT3) 

• Large change from GT2 to GT3 

- New implementation 

- Java-based instead of C-based 

- GT3 based on OGSA 

• GT3 will provide equivalent services to GT2 

• Alpha version of GT3 currently available 

• GT3 and OGSA will revolutionize 

- how services are provided on the grid and 

- how grid applications are developed 

• IPG will transition to GT3 as soon as it is proven stable, while minimizing 
the effect on existing IPG users. 

• Transition should have minimal impact on IPG users 

- Globus will maintain many of the existing programs 

• IPG Services will follow OGSA 
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FOCUS ON IPG HANDLING OF DATA 


As the IPG completes its work on the resource management and utilization phase 
of the grid services, it will focus on the data handling aspects of the grid. This is a 
critical function for NASA because of the large volume of distributed data that is found 
in the various NASA archives, such as those associated with Earth science. 

This new focus will look at providing access to NASA archives, using such 
existing grid-enabled systems as the Storage Resource Broker, developed at the San 
Diego Supercomputer Center. Of particular interest will be providing access to data 
stored both on tertiary storage (mass storage systems) and in disk-resident 
data pools. 

This effort will build on the considerable amount of work that has been performed 
on data grids by the international grid community. 


• Goal: Intelligently manage data in a grid 

• NASA data is inherently distributed, e.g., various Earth 
science archives, including the one at LaRC 

• Important focus of IPG 

• Access to files 

- Initial use of grid-enabled Storage Resource Broker 

- Data staging and replica management building on grid community 
research 

- Need grid support for file metadata 

• NASA data can be on 

- Disk-resident data pools 

- Tertiary storage data archives 

• Will build on considerable data grid work from the 
international grid community 


Focus on IPG Handling of Data 
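
Replica management, as mentioned on the slide, amounts to keeping track of where copies of a logical file live and choosing a good one. The sketch below assumes, for illustration only, that a disk-resident copy is always preferred over a tertiary-storage copy and that a copy at the user's preferred site breaks ties. `ReplicaCatalog`, the file name and the URLs are hypothetical and do not describe the Storage Resource Broker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str
    storage: str   # "disk" or "tertiary"
    url: str


# Assumed ordering: disk pools are cheaper to read than mass storage archives.
STORAGE_RANK = {"disk": 0, "tertiary": 1}


class ReplicaCatalog:
    """Maps a logical file name to the physical copies known to the grid."""

    def __init__(self):
        self._replicas = {}

    def register(self, logical_name, replica):
        self._replicas.setdefault(logical_name, []).append(replica)

    def best_replica(self, logical_name, preferred_site=None):
        replicas = self._replicas.get(logical_name)
        if not replicas:
            raise KeyError(logical_name)
        # Prefer faster storage first, then a copy at the preferred site.
        return min(replicas,
                   key=lambda r: (STORAGE_RANK[r.storage], r.site != preferred_site))


catalog = ReplicaCatalog()
name = "earth/granule-001"
catalog.register(name, Replica("LaRC", "tertiary", "gsiftp://archive.larc.example/granule-001"))
catalog.register(name, Replica("Ames", "disk", "gsiftp://pool.ames.example/granule-001"))
catalog.register(name, Replica("Goddard", "disk", "gsiftp://pool.gsfc.example/granule-001"))
best = catalog.best_replica(name, preferred_site="Goddard")
```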
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OUTLINE 


The last section will focus on the Global Grid Forum. 


Outline 


• What are Grids? 

• Current State of Information Power Grid (IPG) 

• Overview of IPG Middleware 

• Future Directions 

• Global Grid Forum 
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GLOBAL GRID FORUM BACKGROUND 


The Global Grid Forum is an international group that does for grids what the 
Internet Engineering Task Force (IETF) has done for the network through its standards 
work. It was formed in 2001 as a combination of similar grid work in North America 
and Europe and now encompasses the Asia/Pacific grid work as well. It meets three 
times a year in different parts of the world. 


Global Grid Forum Background 

• Began in 2001 as merger of previous regional grid 
forums. 

• Now includes grid technical communities in North 
America, Europe and Asia Pacific 

• Meets three times per year, alternating between North 
America and Europe and Asia/Pacific 

• Modeled after IETF (Internet Engineering Task Force), 
which sets Internet standards. 

• GGF7 was just held in Tokyo, Japan with over 700 
attendees 

• GGF8 will be held in Seattle, WA on June 25-27, 2003 
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GLOBAL GRID FORUM PURPOSE AND ORGANIZATION 


The main purpose of the Global Grid Forum is to provide an international grid 
organization that can support the fair and representative development, review, approval 
and release of both best practices and standards for the grid. 

It is organized into two types of groups. The Working Groups are of limited 
duration and are focused on the goal of producing some specific best practice document 
or standard. Currently there are 24 Working Groups. 

The Research Groups are organized to address grid issues that are not yet ready 
for a best practice document or a standard. Currently there are 20 research groups. 


Global Grid Forum 

• Supports mechanism for formal review, approval and release of 

- Best practices guides 

- Grid standards 

• Organized into two types of groups 

- Working Groups that are expected to produce best practices documents and standards (24 groups) 

- Research Groups which coordinate research on future grid needs (20 groups) 
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GGF WORKING GROUPS 


The slide lists the current GGF Working Groups. Details about each of these 
groups and the current set of documents and standards on which they are working can be 
found on the GGF web site at www.ggf.org. 


GGF Working Groups 

• Grid Checkpoint Recovery 
• New Productivity Initiative 
• Open Grid Services Architecture 
• Open Grid Services Interface 
• Open Source Software 
• Data Access & Integration Services 
• GridFTP 
• Authorization Frameworks and Mechanisms 
• Certificate Authority Ops 
• Grid Certificate Policy 
• Grid Security Infrastructure 
• Open Grid Service Architecture Security 
• Discovery and Monitoring Event Description 
• Network Measurement 
• Grid Information Retrieval 
• Previous activities of the Peer to Peer Working Group 
• Distributed Resource Management Application API 
• Grid Economic Services Architecture 
• Grid Resource Allocation Agreement Protocol 
• OGSA Resource Usage Service 
• Scheduling Attributes 
• Scheduling Dictionary 
• CIM based Grid Schema 
• Usage Record 
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GGF RESEARCH GROUPS 


The slide lists the current GGF Research Groups. Details about each of these 
groups can be found on the GGF web site at www.ggf.org. 


GGF Research Groups 

• Advanced Collaborative Environments 
• Advanced Programming Models 
• Applications and Test Beds 
• Grid Computing Environments 
• Grid User Services 
• Life Sciences Grid 
• Production Grid Management 
• Accounting Models 
• Grid Protocol Architecture 
• Semantic Grid 
• Service Management Frameworks 
• Data Replication 
• Data Transport 
• Grid Benchmarking 
• Relational Grid Information Services 
• Appliance Aggregation 
• OGSA-P2P-Security 
• Grid High-Performance Networking 
• Persistent Archives 
• Site Authentication, Authorization, and Accounting Requirements 
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WHY IS THE GLOBAL GRID FORUM IMPORTANT 


The primary reason that the GGF is important is that it will result in grid 
standards, and grid standards will encourage commercial companies to make grid 
products that satisfy these standards. Standards-based products should be more marketable 
than products that do not satisfy standards. 

In addition, the GGF provides an arena for various application-specific 
requirements to be injected into the international grid community. Currently there are a 
number of application-specific research groups at GGF that may, as the need is found, 
develop application-specific standards or influence other standards work to address needs 
unique to a particular application area. 


Why is the Global Grid Forum Important 

• It will result in grid standards 

- It will encourage commercial products since there will be 
standards which the products can meet 

- Products that meet accepted standards should be more 
marketable 

• It provides a forum to get application-specific 
requirements injected into the grid development efforts 
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REUSABLE COMPONENTS FOR GRID COMPUTING PORTALS 


Marlon Pierce 
Indiana University 
Bloomington, IN 
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Reusable Components 
for Grid Computing 

Portals 


Marlon Pierce 
Community Grids Lab 
Indiana University 


Figure 1 


Grids Today and Tomorrow 




• Grid software enables loosely coupled, globally 

distributed computing 

• “Virtual Organizations”. 

• What does that really mean? 

• Specific services such as global authentication, 
resource allocation management, aggregated 
information services 

• Centered around a few wire protocols and service 
implementations 

• What’s next? Open Grid Service Architecture 

• Use XML (WSDL) to provide a service definition 
language. 

• Extend WSDL to support metadata about services. 
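
To make the service-definition idea concrete, the following minimal sketch parses a tiny WSDL 1.1 style document with the Python standard library and lists the operations of each port type. The service name `JobSubmission`, its operations, and the `meta:` annotation namespace are hypothetical; the annotation only stands in for the kind of service metadata an extended WSDL might carry and is not actual OGSA syntax.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"   # WSDL 1.1 namespace
META_NS = "urn:example:service-metadata"        # hypothetical extension namespace

# A deliberately tiny, hypothetical service definition.
WSDL_TEXT = f"""
<definitions xmlns="{WSDL_NS}" xmlns:meta="{META_NS}" name="JobSubmission">
  <portType name="JobSubmissionPortType" meta:category="compute">
    <operation name="submitJob"/>
    <operation name="getJobStatus"/>
  </portType>
</definitions>
"""

def describe(wsdl_text):
    """Return {portType name: (metadata category, [operation names])}."""
    root = ET.fromstring(wsdl_text)
    result = {}
    for port_type in root.findall(f"{{{WSDL_NS}}}portType"):
        operations = [op.get("name")
                      for op in port_type.findall(f"{{{WSDL_NS}}}operation")]
        category = port_type.get(f"{{{META_NS}}}category")
        result[port_type.get("name")] = (category, operations)
    return result

service_description = describe(WSDL_TEXT)
```

A client that reads such a description can discover what a service offers without any knowledge of how the service is implemented.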


Figure 2 
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What Is Missing? 



• Grids are designed to enable Virtual 
Organizations. 

• Inter-organizational collaboration 

• But we must also support the Real User 

• Provide access to the Grid from any computer 
(or anywhere). 

• Provide user interfaces to Grid services. 

• Provide customizable front ends that contain 
the service front ends. 

• Grid Computing Environments 

• Browser-based Web portals 


Figure 3 


GWD-R (draft-ggf-ogsi-gridservice-23), February 17, 2003 

[Diagram: four protocol (binding) specific stubs, Protocol 1 through Protocol 4, each leading to the invocation of the Web service.] 

Figure 1: A possible client-side runtime architecture 

Figure 1 depicts a possible (but not required) client-side architecture for OGSI. In this approach, 
there is a clear separation between the client application and the client-side representation of the 
Web service (proxy), including components for marshalling the invocation of a Web service over 


Figure 4 
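
The separation between the client application, the proxy, and the binding-specific stubs can be sketched as follows. This is an illustration only, not OGSI code: the class names and the two toy bindings are hypothetical, and a real stub would marshal the call onto a wire protocol.

```python
from abc import ABC, abstractmethod

class BindingStub(ABC):
    """Carries one invocation over one protocol binding."""
    @abstractmethod
    def invoke(self, operation, args):
        ...

class LoopbackStub(BindingStub):
    """Hypothetical in-process binding: calls a local handler directly."""
    def __init__(self, handlers):
        self.handlers = handlers

    def invoke(self, operation, args):
        return self.handlers[operation](*args)

class RecordingStub(BindingStub):
    """Hypothetical binding that only marshals the call into a message string."""
    def invoke(self, operation, args):
        return f"{operation}({', '.join(map(repr, args))})"

class ServiceProxy:
    """Client-side representation of the service; hides the chosen binding."""
    def __init__(self, stub):
        self._stub = stub

    def __getattr__(self, operation):
        # Any unknown attribute is treated as a remote operation name.
        def call(*args):
            return self._stub.invoke(operation, args)
        return call

echo_local = ServiceProxy(LoopbackStub({"echo": lambda text: text.upper()}))
echo_wire = ServiceProxy(RecordingStub())
```

The client code calls `echo_local.echo("hi")` and `echo_wire.echo("hi")` in exactly the same way; only the stub passed to the proxy differs.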
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Grid Computing Environments 


• Organizations setting up Grids have seen the value of 
developing user environments, or Grid Computing 
Environments. 

• 28 articles in November-December 2002 issue of 
Concurrency and Computation: Practice and Experience. 

• IPG Launchpad, HotPage, Alliance Portal, and others 

• World-wide development community interacts through 
the GCE research group in the Global Grid Forum. 

• G. Fox (IU), D. Gannon (IU), and M. Thomas (TACC) co-chair. 

• Grid portal technology is coming of age 

• Reusability of components 

• Common frameworks 


Figure 5 


Example GCE: Gateway Portal 




• Developed for DOD supercomputing centers (ARL 
and ASC MSRCs). 

• Support source-restricted (commercial or otherwise) 
applications 

• Ansys, Abaqus, ZNSFlow, Fluent 

• Developed to support typical, if simple, high 
performance computing services 

• Batch script generation, job submission and monitoring, file 
management and transfer. 

• Do it all securely 


Figure 6 
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[Screenshot: the Gateway Computational Science Portal. The welcome page describes the portal as a tool for remote access and offers three sections: Submit Job, Problem Archive, and Portal Administration. Two overlaid browser windows show the Job Monitor, which refreshes every 30 seconds and lists job 456801 (Sample_Pro, user pierceme, submitted 02/06/2002 19:35:01), and the File Browser (Version 1.00), which lists the directories under /home/airforce/pierceme and supports entering subdirectories, downloading files, and uploading files from the desktop to the HPC directory shown.] 


Figure 7 


Characteristics of Portals 

• Framework contains user interfaces to the services. 

• Backend services accessed through service proxies. 

• The convergent/emergent architecture is a three tiered 
model. 


[Diagram: three-tiered portal architecture. Surviving labels: "Grid and Web", "JDBC, Local, or".] 

The three-tiered architecture is a standard for accessing 
Grid and other services. 


Figure 8 







The three-tiered architecture is a standard for accessing 
Grid and other services. 


Figure 9 


Sharing Portal Services 




• Given that everyone builds essentially around 
the same architecture 

• How do I build a client to interact with someone else’s services? 

• How do I build a compatible service implementation? 

• How can I take someone else’s end-to-end solution and plug it 
into my portal? 

• How do I avoid reinventing basic services like login, view 
customization, and access restrictions on interfaces? 

• To explore possible solutions, we chose to 
implement a new portal project, QuakeSim, 
around the Web services and Portlet models. 


Figure 10 
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QuakeSim Portal 

• A number of simulation methods for studying earthquakes are being 
developed by GEM consortium including: 

• Simplex, Disloc (JPL) 

• Virtual California (UC-Davis) 

• PARK codes (Brown) 

• As codes become more robust and accepted, problems emerge: 

• Need to manage information about distributed data sources: 
multiple databases, sensors, simulated data. 

• Need to organize, manage information about multiple code 
installation sites. 

• Need to simplify access to data, use of codes, and use of 
visualization/analysis tools for broad range of users 

• Need to link together 

• NASA funded activity to develop SERVOGrid 
Interoperability framework 


Figure 11 



Host 1 Host 2 Host 3 


Figure 12 
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Computing Portal Grid Web Services 

• We have built a suite of general purpose Grid Web 
services for managing distributed applications. 

• Core Computing services define general purpose 
functions: 

• Ex: job submission, file transfer, job monitoring, management of 
jobs and results 

• Described as a GridShell, since it plays the same role for the Grid that a shell 
plays for UNIX on a single machine 

• Application Grid Web services include metadata about 
applications. 

• Built on top of core services. 

• Original application NOT changed 

• We have developed a toolkit that allows one to convert 
general software packages into Grid Web Services and 
manage application collections 


Figure 13 


Application Grid Web Services 



• AGWS are designed to make scientific applications (e.g., 
earthquake modeling codes) into Grid Resources 

• AGWS services are described by two XML Schemas: 

• Abstract descriptors describe application options. 
Used by the application developer to deploy his/her 
service into the portal. 

• Instance descriptors describe particular user choices 
and archive them for later browsing and 
resubmission. 


Figure 14 
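
The split between the two descriptor kinds can be illustrated with a small sketch. The actual AGWS descriptors are XML documents whose schemas are not reproduced in these slides; the Python classes, field names, and file names below are hypothetical stand-ins that only show the division of responsibility.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AbstractDescriptor:
    """Published by the application developer: the options a code offers."""
    name: str
    hosts: tuple
    input_files: int
    output_files: int

@dataclass
class InstanceDescriptor:
    """One user's choices for one run, archived for browsing and resubmission."""
    application: str
    host: str
    inputs: list
    output_dir: str

def instantiate(app, host, inputs, output_dir):
    """Check the user's choices against the abstract descriptor."""
    if host not in app.hosts:
        raise ValueError(f"{app.name} is not installed on {host}")
    if len(inputs) != app.input_files:
        raise ValueError(f"{app.name} expects {app.input_files} input file(s)")
    return InstanceDescriptor(app.name, host, list(inputs), output_dir)

# Hypothetical deployment of one code on two hosts.
disloc = AbstractDescriptor(
    "Disloc", ("solar.uits.indiana.edu", "grids.ucs.indiana.edu"), 1, 1)
run = instantiate(disloc, "solar.uits.indiana.edu", ["fault.in"], "/tmp/disloc-out")
```

Because the instance descriptor records every choice, resubmitting an archived job amounts to replaying one stored object.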
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User Application Selection and Submission 

[Screenshots, two steps. (1) Select desired application and host: the portal lists the available codes (Disloc, Simplex, and others) together with the hosts on which each is installed (solar.uits.indiana.edu, grids.ucs.indiana.edu, noahsark.ucs.indiana.edu); the user picks one and clicks "Make Selection". (2) Generate script for job submission: a PBS Script Generator page inside Jetspeed asks for the amount of memory, job name, wall time (hh:mm:ss), number of CPUs, e-mail notification (when the job begins, ends, or aborts), the input file, and the directory on the HPC server where the output file should be placed.] 


Figure 15 
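
The script-generation step shown in the second screenshot can be sketched as a function that turns form fields into PBS directives. The directives used (`-N`, `-l`, `-m`, `-M`) are standard PBS options, but resource names such as `ncpus` vary between sites, and the function name, command line, and e-mail address are hypothetical. This is not the portal's actual generator.

```python
def pbs_script(job_name, memory, walltime, ncpus, command,
               email=None, on_begin=False, on_end=False, on_abort=False):
    """Build the text of a PBS batch script from form-style fields."""
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",
        f"#PBS -l mem={memory}",
        f"#PBS -l walltime={walltime}",
        f"#PBS -l ncpus={ncpus}",
    ]
    # PBS mail events: b = job begins, e = job ends, a = job aborts.
    events = (("b" if on_begin else "")
              + ("e" if on_end else "")
              + ("a" if on_abort else ""))
    if email and events:
        lines.append(f"#PBS -m {events}")
        lines.append(f"#PBS -M {email}")
    lines.append("cd $PBS_O_WORKDIR")   # run from the submission directory
    lines.append(command)
    return "\n".join(lines) + "\n"

script = pbs_script("disloc_run", "32mb", "00:15:00", 1,
                    "./disloc fault.in fault.out",
                    email="user@example.org", on_end=True, on_abort=True)
```

Generating the script from a form keeps queue syntax out of the user's way while leaving the original application untouched.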
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Portlets for Reusable Portal Components 

• What we found was that groups did not really want to use 
common interfaces so much as share end-to-end services (user 
interfaces-client stubs-service implementations). 

• Portlets/containers provide a simple way to do this. 

• The container implements all portal specific services 

• Manages user customizations, logins, access controls 

• Container treats all web content as generic ‘portlet’ objects. 

• Controls which portlets are displayed and how they are arranged. 

• Portlets and containers are implemented in Java 

• Tomcat webapp 


Figure 17 
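
The division of labor between container and portlets can be sketched as below. The real portlets and containers are Java components; this sketch uses Python purely for brevity, and every class, method, and portlet name in it is hypothetical rather than part of any portlet API.

```python
class Portlet:
    """A unit of web content; the container does not care what it renders."""
    def __init__(self, name, render, roles=("user",)):
        self.name = name
        self.render = render        # callable returning an HTML fragment
        self.roles = set(roles)     # roles allowed to see this portlet

class Container:
    """Owns access control and per-user layout for all portlets."""
    def __init__(self):
        self.portlets = {}
        self.layouts = {}           # user -> ordered list of portlet names

    def register(self, portlet):
        self.portlets[portlet.name] = portlet

    def customize(self, user, names):
        self.layouts[user] = list(names)

    def page(self, user, role):
        """Assemble the user's page from the portlets the role may see."""
        fragments = []
        for name in self.layouts.get(user, []):
            portlet = self.portlets[name]
            if role in portlet.roles:
                fragments.append(f"<div class='portlet'>{portlet.render()}</div>")
        return "\n".join(fragments)

container = Container()
container.register(Portlet("job-monitor", lambda: "Job Monitor"))
container.register(Portlet("portal-admin", lambda: "Portal Admin", roles=("admin",)))
container.customize("alice", ["portal-admin", "job-monitor"])
page = container.page("alice", "user")
```

A portlet author writes only the `render` callable; layout, customization, and access checks stay in the container and are shared by every portlet.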



[Screenshot: a portal page carrying the NCSA, Alliance, Globus Project, and Java CoG Kit logos, with three portlets from the xportlets package. The Grid FTP Client portlet browses remote directories, transfers selected files between two hosts, and uploads files from the local machine; before use it requires a proxy retrieved through ProxyManager (the screenshot shows "Error: proxy null"). The ProxyManager portlet retrieves a proxy from a MyProxy server given a hostname, port, username, password, and lifetime in hours. The LdapBrowser portlet browses an LDAP directory (here dc=cs,dc=indiana,dc=edu on palomar.extreme.indiana.edu:389), shows the attributes of each node, and, when a leaf node is a service, lets the user choose one of its methods, enter the parameters, and invoke it.] 
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Value of the Portlet Approach 



• With portlets, we have a common 
infrastructure for managing content. 

• I don’t have to reinvent login, user customization 
services. 

• But I may choose to add my own service 
implementation in a well defined way. 

• Content (and service user interfaces) are 
added in a well defined way 

• Edit an xml registry file. 


Figure 19 


Portlet Implementations 



• Several groups (IU, TACC, NCSA, UMich) are using 
Jetspeed 

• Open source portlet implementation from Jakarta 

• We extend it to 

• Add custom services for message boards, chats, etc. 

• Develop specific portlets to Grid services (like GridFTP). 

• Build general purpose portlets to support needs of Grid 
service interfaces 

• Session state conversations, multipage content, security 

• Bridge to legacy JSP and non-Java Web interfaces 


Figure 20 
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The Grid Portal Consortium's initial architecture aggregates 
multiple services into a single portal using portlet containers. 


Figure 21 


Portlet Longevity 



• Portlets have become popular in commercial 
enterprise servers 

• The portlet API is being standardized through 
the Java Community Process. 

• Participants include IBM, Oracle, BEA, and 
others. 

• We anticipate, or will contribute to building, the 
open source reference implementation of the 
standard. 


Figure 22 




Portlets and Portal Stacks 



• User interfaces to Portal 
services (Code Submission, 
Job Monitoring, File 
Management for Host X) 
are all managed as 
portlets. 

• Users and administrators can 
customize their portal 
interfaces to precisely 
the services they want. 


[Diagram: portal stack. Surviving labels: Aggregation Portals; User facing Web Service Ports; Application Grid Web Services.] 


Figure 23 


Future Developments 



• User interfaces and services need to get 
much more sophisticated, intelligent. 

• Case-based reasoning interface for Earthquake 
simulation codes. 

• More standard collaboration services as portlets 
• Whiteboards, chat interfaces 

• Ubiquitous access in a standard fashion 

• Portlet repositories to allow community 
sharing of reusable components. 


Figure 24 
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More Information 


• My Email: marpierc@indiana.edu 

• Gateway homepage: 

• More publications: 

http://otloortal.ucs.indiana.edu . 


Figure 25 
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Research by Federal Agencies That Will Affect 
Future Computing Paradigms for Aerospace 


David Nelson 

National Coordination Office for Information Technology Research and Development 

Arlington, VA 
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Figure 1 



National Coordination Office (NCO) for 
Information Technology Research and 
Development (IT R&D) 


Mission: To formulate and promote Federal information technology research and 
development to meet national goals 


• NCO Director reports to the Director of the White House Office of Science 
and Technology Policy (OSTP) 

• Coordinates planning, budget, and assessment activities for the Federal 
multiagency Networking and Information Technology Research and Development 
(NITRD) Program 

• Supports the six technical Coordinating Groups (CGs) that report to the 
Interagency Working Group (IWG) on IT R&D 

- Research planning workshops, conferences, and meetings 

- Presentations, white papers, and research reports 

• Provides technical and administrative support to the IWG and President’s 
Information Technology Advisory Committee (PITAC) 

• Informs the public of Federal achievements and challenges in IT R&D 

- Maintains a Web site 

- Publishes annual budget documents in cooperation with the IT R&D agencies 

- Publishes PITAC reports 
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PUBLICATIONS 


Publications of the President’s Information Technology Advisory Committee 
include “Developing Open Source Software to Advance High End Computing,” which was 
handed out at the workshop. Open source software is software for which the human-readable 
source code is made widely available, either as public domain software or as 
copyrighted software with a license that requires source code to be made available. Open source 
software is an important emerging factor that will affect future aerospace computing. 
Discussion of open source would be a useful topic for this meeting, but time did not 
allow its inclusion in this talk. 



Publications 



• Annual publication of the Supplement to the President’s Budget, also known 
as the “BLUE BOOK,” describes the NITRD Program 

http://www.itrd.gov/pubs/blue03/03BB-final.pdf 

• President’s Information Technology Advisory Committee (PITAC) reports 

Developing Open Source Software to Advance High End Computing 
http://www.itrd.gov/pubs/pitac/pres-oss-11sep00.pdf 

Digital Libraries: Universal Access to Human Knowledge 
http://www.itrd.gov/pubs/pitac/pitac-dl-9feb01.pdf 

Transforming Health Care Through Information Technology 
http://www.itrd.gov/pubs/pitac/pitac-hc-9feb01.pdf 

Using Information Technology To Transform the Way We Learn 
http://www.itrd.gov/pubs/pitac/pitac-tl-9feb01.pdf 
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GRID TECHNOLOGY OPPORTUNITIES AND NEEDS 


The workshop on Middleware and Grid Technology, organized by the Large 
Scale Networking Coordinating Group, produced a report that will be published shortly. 
Some conclusions of the workshop are presented in these viewgraphs. 



Grid Technology Opportunities and Needs 


• Large Scale Networking Workshop on Middleware and Grid 
Technology, August 13-14, 2002 - to be published 

• Grid technologies foster collaboration and distributed access 
that are fundamental to the new ways of doing interdisciplinary 
research 


• Significant new capabilities are needed to support near-term 
needs of discipline sciences (Network for Earthquake 
Engineering Simulation, Large Hadron Collider at CERN, 
Genetics database, ...) 



• Generalized capabilities and standards are needed now to 
prevent each discipline science from developing its own unique 
Grid capabilities (Balkanization) 
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Grid Technology Needs, Concluded 


• Industry is not focused on the longer term research needed to 
further develop the Grid. Federal research is needed. 


• New technical capabilities are needed 

- Testbeds and prototypes for simulations and collaboratories 

- Persistent, reliable, high-performance infrastructure 

- Grid economics and accounting 

- Security implementation 

- Standards applying across disciplines and international boundaries 

- Policies for interacting, sharing, and accounting 

- Multidisciplinary, robust, easy-to-use Grid technology and tools 
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GRID COMMUNITIES AND APPLICATIONS: 
HIGH ENERGY PHYSICS PROBLEMS SCALE 


Physics data from the Compact Muon Solenoid, a detector on the Large Hadron 
Collider at CERN in Geneva, Switzerland, will be managed using grid technology. The 
grid is hierarchical, in that data flow primarily from top to bottom and at each stage the 
flow rate decreases. The intent of the grid is to provide “seamless interaction” by 
physicists with each other and with the data. 

A similar grid structure, including the Open Grid Services Architecture and the Globus 
Toolkit, could be applied to large-scale NASA missions such as the Earth Observing 
System. 



Figure 6 
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GRID TECHNOLOGY SCENARIO FROM WORKSHOP 


The Virtual National Airspace Simulation Environment is a NASA-based 
scenario from the workshop. The scenario includes dealing with an in-flight emergency 
that cripples the airplane and requires special pilot responses. The viewgraph lists the 
grid technology requirements needed for this scenario. 



Grid Technology Scenario from Workshop 


• Virtual National Airspace Simulation Environment 

• Grid Technology Requirements 

- Access to distributed computational resources to support real-time 
simulations 

- Access to distributed simulation models 

- Access to distributed information resources 

- Real-time access to on-line sensor data, e.g. weather sensors, on-board 
aircraft sensors 

- Priority for commanding use of resources 

- Security 

- Reliability, robustness for critical functions 

- Collaboration technology and user interfaces 

- Real-time monitoring and management of Grid tools and resources 
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Earth Simulator Has Inspired a New Look at 
U.S. High End Computing 


• Based on the NEC SX architecture, 640 nodes, each node with 8 vector 
processors (8 GFlop/s peak per processor), 2 ns cycle time, 16GB 
shared memory 

- Total of 5104 processors, 40 TFlop/s peak, and 10 TB memory 







Source: http://www.es.jamstec.go.jp/esc/eng/outline/outline02.html 


• Has a single-stage crossbar switch (1800 miles of cable), 83,000 copper 
cables, 16 GB/s cross-section bandwidth 

• 700 TB disk space 

• 1.6 PB mass store 

• Area of computer = 4 tennis courts, 3 floors 
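
As a consistency check on the figures quoted on this slide, the per-node numbers can be multiplied out. The sketch below uses only values from the slide; note that 640 nodes of 8 processors gives 5120, slightly more than the 5104 quoted, and the sketch does not attempt to explain that difference.

```python
nodes = 640
processors_per_node = 8
gflops_per_processor = 8        # peak, per vector processor
memory_per_node_gb = 16
cycle_time_ns = 2

processors = nodes * processors_per_node                   # 5120
peak_tflops = processors * gflops_per_processor / 1000     # about 41 TFlop/s
memory_tb = nodes * memory_per_node_gb / 1000              # about 10 TB

clock_ghz = 1 / cycle_time_ns                              # 2 ns cycle -> 0.5 GHz
flops_per_cycle = gflops_per_processor / clock_ghz         # 16 per processor
```

The products agree with the rounded totals on the slide (40 TFlop/s peak and 10 TB of memory).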
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PERFORMANCE MEASURES OF SELECTED TOP COMPUTERS 


This viewgraph presents three performance measures for the computers at the top 
of the Top 500 supercomputers list (www.top500.org). The performance measures are R-peak, 
the theoretical peak performance of the computer, measured in GFlop/s; 
R-max, the best performance on the Linpack benchmark, also measured in GFlop/s; 
and the Stream Triad benchmark, which gives the memory access rate for the 
calculation C(I) = A(I) + Q*B(I) for very large vectors, measured in GB/s. 
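
The triad calculation and the way a memory rate is derived from it can be sketched in a few lines. This is an illustration of the arithmetic only, not the STREAM benchmark itself: pure Python lists measure interpreter overhead rather than hardware memory bandwidth, and the vector length and the value of Q are arbitrary choices.

```python
import time

def triad(a, b, q):
    """Return c with c[i] = a[i] + q * b[i]."""
    return [x + q * y for x, y in zip(a, b)]

def triad_bandwidth(n, q=3.0, bytes_per_word=8):
    """Time one triad over n-element vectors and report GB/s of memory traffic."""
    a = [1.0] * n
    b = [2.0] * n
    start = time.perf_counter()
    c = triad(a, b, q)
    elapsed = time.perf_counter() - start
    bytes_moved = 3 * bytes_per_word * n    # read a, read b, write c
    return c, bytes_moved / elapsed / 1e9

c, gbytes_per_second = triad_bandwidth(1_000_000)
```

Each element involves two floating-point operations but three memory words, which is why the triad stresses memory bandwidth rather than arithmetic throughput.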



Figure 9 
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Several Federal Agencies Have Recently 
Examined High End Computing Needs 


• They are mostly using COTS-based HEC 

• Most expect COTS to be acceptable in near term, however: 

- Time-to-solution becoming too long 

- Too hard to program; too hard to optimize 

- Coordinated improvements are needed in hardware, software, and application 
algorithms 

- Rapidly escalating demand on HEC facilities 

• Some important applications/algorithms are not amenable to 
COTS-based HEC 

- Primarily due to non-local memory reference e.g., long vectors requiring 
gather-scatter operations 
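
For readers unfamiliar with the term, a gather reads elements from irregular positions of a vector and a scatter writes them back to irregular positions. The sketch below is a minimal illustration with made-up data; it shows the access pattern only and says nothing about performance on any particular machine.

```python
def gather(source, indices):
    """Collect source[i] for each index; the reads jump around memory."""
    return [source[i] for i in indices]

def scatter(values, indices, size):
    """Write values[k] to position indices[k] of a new zero-filled vector."""
    target = [0.0] * size
    for value, i in zip(values, indices):
        target[i] = value
    return target

field = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
indices = [5, 0, 3]                      # irregular, non-contiguous access
picked = gather(field, indices)
doubled = [2 * v for v in picked]
updated = scatter(doubled, indices, len(field))
```

Because consecutive accesses are far apart, such loops gain little from caches, which is one reason they favor machines with high memory bandwidth.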




Figure 10 



Examples of Applications for Which COTS 
May Be Unsuitable 


• Hypersonic air-breathing propulsion 

- Needs high memory-to-CPU bandwidth for multi-disciplinary analysis 

• Reusable Launch Vehicle Design 

- Needs high memory-to-CPU bandwidth 

• Protein Folding 

- Poorly parallelizable 

• Cryptanalysis 

- Needs fast flat-memory model 

• Climate data assimilation 

- Part of problem not easily parallelizable, needs high memory-to-CPU bandwidth 
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Agency Conclusions 


• Further progress in HEC will require balanced, coordinated effort in 

- Research, development, and engineering of new HEC architectures and systems 

- Procurement of new COTS and custom systems 

- Better software (systems, middleware, and applications) 

- Better domain science (mathematics and algorithms) 

• HEC is a decreasing part of the technical computing marketplace. 

• COTS-based HEC is largely based on technologies developed for 
low- and mid-range markets (SMP nodes, low bandwidth 
interconnects). 

• Market pressure may result in future COTS systems being less 
responsive to HEC needs. 

• Federal funding of highest-performing HEC, including development 
of new systems, may be required. 


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High End Computing Revitalization 
Task Force (HECRTF) Charge 


• Rationale: High End Computing (HEC) increasingly critical 

• HECRTF coordinated through National Science and 
Technology Council (NSTC) 

• To develop a plan that can guide future Federal HEC 
investments 

• Plan will lay out an overall strategy for these investments 

• Seek wide participation by Federal agencies developing or 
using HEC 

• Final report to be completed by August 2003, in time to be an 
input to FY 2005 budget 
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High Productivity Computing Systems 


Robert Graybill 
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INTRODUCTION 


High performance computing is at a critical juncture. Over the past three decades, 
this important technology area has provided crucial superior computational capability for 
many important national security applications. Unfortunately, current trends in 
commercial high performance computing, future complementary metal oxide 
semiconductor (CMOS) technology challenges and emerging threats are creating 
technology gaps that threaten continued U.S. superiority in important national security 
applications. 

As reported in recent DoD studies, there is a national security requirement for 
peta-scale high productivity computing systems. Without government R&D and 
participation, high-end computing will be available only through commodity 
manufacturers primarily focused on mass-market consumer and business needs. This 
solution would be ineffective for important national security applications. 

Improving system performance is no longer sufficient to increase system 
productivity. DARPA’s High Productivity Computing Systems (HPCS) program must 
also improve system programmability, portability, and robustness. HPCS is pursuing the 
research and development of balanced, economically viable high productivity computing 
system solutions for the national security and industrial user communities. 



High Productivity 
Computing Systems 



Goal: 


> Provide a new generation of economically viable high productivity computing 
systems for the national security and industrial user community (2007 - 2010) 





Applications: 

• Intelligence/surveillance, reconnaissance, cryptanalysis, weapons analysis, airborne contaminant 
modeling and biotechnology 


Impact: 

• Performance (time-to-solution): speedup critical national 
security applications by a factor of 10X to 40X 

• Programmability (time-for-idea-to-first-solution): reduce 
cost and time of developing application solutions 

• Portability (transparency): insulate research and 
operational application software from system 

• Robustness (reliability): apply all known techniques to 
protect against outside attacks, hardware faults, & 
programming errors 


HPCS Program Focus Areas 


Fill the Critical Technology and Capability Gap 
Today (late 80’s HPC technology) to Future (Quantum/Bio Computing) 


Figure 1 
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VISION 


Today’s high-end systems tend to fall into one of two domains: the vector 
supercomputer domain or the commodity high performance computer domain. Foreign 
computer vendors dominate the vector domain with Cray as the sole domestic supplier. 
A majority of the tera-scale computing installations in the United States consist of 
commodity HPCs. 

The High Productivity Computing Systems program will bridge the gap between 
the late-80’s based technology of today’s High Performance Computers and the promise 
of quantum computing for the Department of Defense. DARPA’s challenge is to develop 
a broad spectrum of innovative technologies and architectures integrated into a balanced 
total system solution by the end of this decade. 



Vision: Focus on the Lost Dimension of HPC 
“User & System Efficiency and Productivity” 



Moore's Law 
Double Raw 
Performance every 
18 Months 


New Goal: 
Double Value Every 
18 Months 


Fill the high-end computing technology and capability gap 
for critical national security missions 




Figure 2 
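
To give a feel for what doubling every 18 months implies, the short calculation below asks how long steady doubling would take to reach the 10X and 40X speedups named in the program goals. This is an illustrative extrapolation under the assumption of uninterrupted doubling, not a schedule taken from the program.

```python
import math

def months_to_reach(factor, doubling_months=18):
    """Months of steady doubling needed to grow by the given factor."""
    return doubling_months * math.log2(factor)

years_for_10x = months_to_reach(10) / 12    # roughly 5 years
years_for_40x = months_to_reach(40) / 12    # roughly 8 years
```

Under that assumption, a 10X gain takes about five years and a 40X gain about eight.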
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SCHEDULE 


To achieve the aggressive goal of revolutionary HPCS solutions by the end of this 
decade, three top-level program phases have been identified to address the challenges of 
scalable vector and commodity HPC solutions for today and tomorrow. The three phases 
are concept study, research and development, and full-scale development. The one-year 
Phase I industry concept study, completing in June 2003, will provide critical technology 
assessments, develop revolutionary HPCS concept solutions, and supply new 
productivity metrics necessary to develop a new class of high-end computers by the end 
of this decade. 

The second phase of the HPCS program is a three-year research and development 
effort that will perform focused R&D and risk reduction engineering activities. These 
pursuits will result in a series of system design reviews, preliminary design reviews and 
risk reduction prototypes and demonstrations. The technical challenges and promising 
solutions identified during the concept study will be explored and prototyped by a full 
complement of commercial industry, university, and research laboratory researchers. 

Phase III, full-scale development, will be led by commercial industry. This phase 
will last four years and complete the detailed design, fabrication, integration and 
demonstration of the full-scale HPCS pilots. 


HPCS Program Phases I - 


[Timeline chart by fiscal year. Surviving labels: HPCS Capability or Products; Application Analysis; Performance Assessment; Industry; Reviews; Industry Procurements; Critical Program Milestones; Phase III Full Scale Development.] 
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HPCS TEAMS 


Phase I HPCS vendors are listed below. HPCS concept study awards were made 
to industry teams led by IBM, Cray, SGI, Sun, and HP. Across these industry teams at 
least 20 different universities are represented. 

Throughout all three phases of the program, application analysis and performance 
assessment activities will be carried out. Some work is done by the HPCS vendors, for 
their own benefit. Much of the work is being done by national labs, universities and 
other organizations for the benefit of the entire HPCS program. These organizations 
make up the Applications Analysis and Performance Assessment Team. The team is 
being led by MITRE and MIT/Lincoln Laboratory. 


HPCS Phase I Industry Teams 

Industry: 

Cray, Inc. (Burton Smith) 

Hewlett-Packard Company (Kathy Wheeler) 

International Business Machines Corporation 
(Mootaz Elnozahy) 

Silicon Graphics, Inc. (Steve Miller) 

Sun Microsystems, Inc. (Jeff Rulifson) 

Application Analysis/Performance Assessment Team: 

MITRE 

MIT Lincoln Laboratory 

Figure 4 
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APPLICATION ANALYSIS AND PERFORMANCE ASSESSMENT 


The Application Analysis and Performance Assessment Team is studying those 
mission areas identified as having inadequate available computational resources and is 
coordinating with the HPCS Mission Partners to identify challenge applications that will 
serve as the requirements drivers for HPCS. The challenge application selection process 
started with inputs from the DDR&E and Integrated High-End Computing (IHEC) 
Mission Analysis studies, which identified areas where deficiencies in the present 
computing capabilities exist that affect mission performance. Consultations with HPCS 
Mission Partners generated lists of actual operational and research codes and an 
understanding of the partners’ software development processes and system utilization 
patterns. The team has identified full-scale applications, compact applications and kernels 
that represent the mission partners’ needs and supplied them to the HPCS Phase I 
vendors. The team is working to profile these applications and to characterize the 
underlying requirements in parallel with the HPCS vendors. 

The Application Analysis and Performance Assessment Team has also worked 
with the HPCS Phase I vendors on development of HPCS productivity metrics, and a 
framework that puts them into a concise context for HPCS. 



Figure 5 
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APPLICATIONS 


As reported in recent DoD studies, there is a national security requirement for 
high productivity computing systems. Without government R&D and participation, high- 
end computing will be available only through commodity manufacturers primarily 
focused on mass-market consumer and business needs. The HPCS program will 
significantly contribute to DoD and industry information superiority in at least the 
applications areas colored in red and blue on the chart. The HPCS program will create 
and supply new systems and software tools that will lead to increased productivity of the 
applications used to solve these critical problems. 

The HPCS mission areas highlighted in red and blue were chosen from two 
studies of national security computing needs. The DDR&E study performed by the Office 
of the Secretary of Defense focused on the national security requirements for high-end 
computers. The Integrated High-End Computing (IHEC) Mission Analysis performed at 
the request of Congress was a much broader study exploring the requirements, key 
technologies, proposed long implementation/organization strategy, and funding 
projections. 


Application Focus Selection 

DDR&E Study 

• Operational weather and ocean forecasting 
• Planning activities for dispersion of airborne/waterborne contaminants 
• Cryptanalysis 
• Intelligence, surveillance, reconnaissance 
• Improved armor design 
• Engineering design of large aircraft, ship and structures 
• National missile defense 
• Test and evaluation 
• Weapon (warheads and penetrators) 
• Survivability/stealth design 

IHEC Study 

• Comprehensive Aerospace Vehicle Design 
• Signals Intelligence (Crypt) 
• Signals Intelligence (Graph) 
• Operational Weather/Ocean Forecasting 
• Stealthy Ship Design 
• Nuclear Weapons Stockpile Stewardship 
• Signal and Image Processing 
• Army Future Combat Systems 
• Electromagnetic Weapons Development 
• Geospatial Intelligence 
• Threat Weapon Systems Characterization 

* Bioscience 


Figure 6 
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WORKFLOWS 


In conducting interviews with HPCS Mission Partners during Phase I, the 
Application Analysis and Performance Assessment Team found that three general 
workflows are representative of the Partners’ operations and needs. Workflows identify 
how Mission Partners use HPCs — they describe the iterative processes of software 
development and system utilization and define mission partners' priorities. 

The workflows that characterize HPCS missions are lone researcher, enterprise 
development and production/operations. For each class of user, the total time to solution 
is strongly dependent upon the coupling that exists between execution time and 
development time. The diagrams on the left represent a high-level view of the operational 
workflow, while the diagrams on the right represent the software development 
workflows. For example, the first row of the chart depicts the workflow of the “Lone 
Researcher.” His or her goal is to rapidly understand and solve a domain-specific 
problem. The overall execution cycle is characterized by rapid iterations between the 
development of new hypotheses or theories and testing those theories computationally. 
The development model is characterized by rapid prototyping. This is very different 
from the production/operations workflow in row three. Here the goal is to create a fielded 
system that will rapidly process external inputs to provide actionable data to decision 
makers. The overall execution cycle is driven by real-time considerations. The 
development cycle consists of both an initial development of the system and a 
maintenance cycle once it is fielded. 


HPCS Mission Work Flows 

Researcher overall cycle: days to hours 
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HPCS Productivity Factors: Performance, Programmability, 
Portability, and Robustness are very closely coupled with each work flow 
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ASSESSMENT FRAMEWORK 


Common metrics (such as peak floating-point operations per second) are 
insufficient for understanding and assessing system capabilities. The Application 
Analysis and Performance Assessment Team has worked with mission partners and 
HPCS vendors to develop an appropriate assessment framework. The initial framework 
shown consists of: Productivity Metrics (e.g. development time and execution time); 
System Parameters (e.g. bandwidth, flops/cycle, size, power, lines-of-code); Workflows, 
Benchmarks and Systems models. The system parameters and benchmarks are depicted 
as inputs into an actual or modeled HPCS system and generate productivity metrics. The 
productivity metrics are depicted as inputs into mission workflow models, which can be 
used to determine the productivity (or value) of a particular system for a particular 
mission. Workflows provide insights on how the various mission partners will evaluate 
HPCS systems. 

Implicit Productivity Factors such as performance, programmability, portability 
and robustness are attributes of both the system and the workflow and reflect the 
system capabilities and the needs of the mission. 
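
The relationship between the productivity metrics and the productivity of a system for a mission can be illustrated with a small sketch. It assumes, for illustration only, that productivity is utility divided by the sum of development-time and execution-time costs; the class name, function name, units and numbers are hypothetical and are not taken from the HPCS framework itself.

```python
from dataclasses import dataclass


@dataclass
class WorkflowCosts:
    """Time costs for one mission workflow, in hours (illustrative units)."""
    development_hours: float
    execution_hours: float


def productivity(utility: float, costs: WorkflowCosts) -> float:
    """Return utility divided by total time cost (development + execution).

    Summing the two costs is an assumption made for this sketch.
    """
    total_cost = costs.development_hours + costs.execution_hours
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return utility / total_cost


# Hypothetical numbers: the same utility delivered at very different cost.
lone_researcher = WorkflowCosts(development_hours=40.0, execution_hours=10.0)
production = WorkflowCosts(development_hours=400.0, execution_hours=100.0)

print(productivity(100.0, lone_researcher))  # 2.0
print(productivity(100.0, production))       # 0.2
```

A real workflow model would weight the two costs differently for each mission, which is why the framework treats workflows as a separate input.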


HPCS Productivity Framework 

Activity & Purpose 

Productivity (Ratio of Utility/Cost) 

Productivity Metrics 

- Execution Time (cost) 
- Development Time (cost) 

System Parameters (Examples) 

- BW bytes/flop 
- Memory latency 
- Memory size 
- Processor flop/cycle 
- Bisection BW 
- Total Connections 
- Size (cuft) 
- Power/rack 
- Facility operation 
- Code size 
- Restart time (reliability) 
- Code Optimization time 

Implicit HPCS Productivity Factors: 
Performance, Programmability, Portability, and Robustness 

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SUMMARY 


The HPCS program has received very positive response from the vendor, 
government, and university communities. HPCS represents the first comprehensive 
high-end computing program since the early 1990s. The focus on productivity or the ability to 
easily program highly parallel systems with high sustained performance across a 
spectrum of computing applications represents not only a significant challenge but an 
opportunity to fill a major gap in realizable parallel computing. HPCS provides the 
vendors with an incentive to break out of the current evolutionary computing 
development paradigm by exploring new innovative technologies, architectures, and 
programming techniques. The very active and synergistic participation of the DoD users, 
universities, and vendors in all phases of this program is beginning to pay off. HPCS is 
laying the foundation for future larger scale programs such as the one proposed in the 
IHEC Congressional Study Report. 


HPC Community Reactions 


• DoD User Community 

- Active participation in reviews 

- Providing challenge problems 

- Linking with internal efforts 

- Providing funding synergism 

• Industry 

- Finally an opportunity to develop a non evolutionary vision 

- Active program support (technical, personnel, vision) 

- Direct impact to future product roadmaps 

• University 

- Active support for Phase 1 (2X growth from proposals) 

• Extended Community 

- HPCS strategy embedded in Congressional IHEC Report 


Productivity a new HPC Sub-discipline 




Figure 9 
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BUILDING A COLLABORATIVE BRIDGE - TECHNOLOGY 
RESEARCH, EDUCATION, and COMMERCIALIZATION CENTER 


Jonas Talandis 

National Center for Supercomputing Applications 
University of Illinois Urbana/Champaign 
Chicago, IL 
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BUILDING A COLLABORATIVE BRIDGE - TECHNOLOGY RESEARCH, 
EDUCATION, and COMMERCIALIZATION CENTER 


The Technology Research, Education, Commercialization Center (TRECC) is a 
technology center located in Dupage County, IL, west of Chicago. TRECC is sponsored 
by the United States Office of Naval Research. 



Building a Collaborative Bridge - 
Technology Research, Education, and 
Commercialization Center 
(TRECC) 

NASA Langley Research Center 
Hampton, VA 
March 18, 2003 

Jonas Talandis 

National Center for Supercomputing Applications 

jonast@ncsa.uiuc.edu 

www.trecc.org 


University of Illinois at Urbana-Champaign 
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TRECC MISSION 


The mission of TRECC is to accelerate the development of innovative ideas, to 
develop new education applications and learning systems, to demonstrate on-the-horizon 
information technologies, and to incubate start-up technology businesses. 


TRECC Mission 


• Showcase Technology 
Research 

• Develop Educational 
Applications and 
Learning Systems 

• Technology Transfer and 
Commercialization of 
Emerging Technologies 





Figure 2 






WHO ARE WE? 


The Technology Research, Education and Commercialization Center, or TRECC, 
is a University of Illinois at Urbana-Champaign (UIUC) program, managed by the 
National Center for Supercomputing Applications (NCSA), sponsored by the Office of 
Naval Research (ONR). 



Who Are We? 


• The U of I and NCSA 

- Program Management, 
Technology Research, and 
Continuing Education 

• Office of Naval Research 

- Funding and Direction 

• Battelle Memorial Institute 

- Commercialization and 
Small Business Support 




Figure 3 







WHAT DO WE DO? 


TRECC sponsors development of and showcases technologies developed by our 
partners in the National Computational Science Alliance (Alliance) of which NCSA is the 
leading-edge site. Battelle Memorial Institute (Battelle) of Columbus, OH is the 
subcontractor responsible for Small Business Assistance and Client Services. They match 
private-sector industry with appropriate government technologies, resources or interests. 
UIUC provides Education, Training and Learning Research in the form of Business 
Education, Continuing Ed, directed Academic Outreach and e-Learning programs. 


What Do We Do? 

3 Key Thrusts 

Technology Research (University of Illinois) 

Tech Transfer and Commercialization (Battelle) 

- Small Business Support 
- Commercialization 

Education/Training (University of Illinois) 

- Seminars & Workshops 
- Training Courses 
- Continuing Education 
- MSTE 
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TECHNOLOGY RESEARCH/DEMONSTRATION 


TRECC exposes government and businesses to leading-edge infrastructure 
technologies. Our Grid presence gives the community access to resources and 
serves as an application test-bed for grid-related applications. TRECC’s collaborative 
backbone is the Access Grid, complemented by other video teleconferencing (VTC) and 
immersive technologies, and we assist our clients in deploying them. 


Technology 

Research/Demonstration 


• Leading Edge Infrastructure Technologies 

- Grid Facilities 

- Community Portal to National Technology Grids 

- Scientific, Engineering and Educational Applications to 
Validate the Grid’s Commercial Relevance. 

- Collaborative Technologies 

- Access Grid Node and other Tele-Immersive 

- Partner Deployment Assistance 

- Deployable Demonstrations 

- Distributed Cluster Computing Architectures 

- Advanced Display Technologies 

- Data Mining and Information Visualization Frameworks 

- Collaboration Frameworks for Information Sharing 




Figure 5 
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TRECC EDUCATION 


TRECC is an education and training hub that prepares the workforce of today and 
tomorrow, along with the science and technology teachers of tomorrow. The MSTE 
program works with the Technology Center of Dupage on curriculum development for 
teachers, aimed at interesting students in math and science studies, tools they will need 
to succeed. TRECC offers workshops and training for businesses and technical 
professionals. The knowledge center has four key components: a database, collaborative 
conferencing tools, a knowledge exchange utility and an e-Learning environment. 


TRECC Education 


Education and Training Hub 

- Math and Science Teacher Education (MSTE) 

- Vocational Outreach 

- Continuing Education 

- Business and Entrepreneurial Training 

- IT and Technical 

- Knowledge Center 

- Knowledge Base 

- Collaboration Space 

- Knowledge Exchange 

- e-Learning Environment 


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TECH TRANSFER/COMMERCIALIZATION 


Battelle is responsible for TRECC’s tech transfer and client service activities. 
They work to match available government technologies with suitable businesses, or can 
work the other way, bringing private-sector technology to government interests. Small 
business assistance is provided. A client’s technology is assessed for innovative value and 
marketability. Business plans are outlined and/or reviewed. Funding and opportunity 
databases are searched for available matches. Clients are assisted throughout the proposal 
process, in preparing paperwork and preparing conduits for progress. 


Tech Transfer/Commercialization 


Identify Suitable DoD Technologies 

- Liaison with Tech, Processes and Systems 

- Match with Partners 
Client Services 

- Technology Assessment 

- Market Opportunities 

- Partnership Support 

- Information Services 

- SBIR/STTR/BAA Database Searches 

- DOD Funding Opportunities 

- Proposal Support 


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DUPAGE AIRPORT AUTHORITY 


TRECC is located on the third floor of the Flight Center, Dupage Airport, West 
Chicago, IL. Aviators can scrape their wingtips on the building when visiting TRECC. 
Dupage Airport (DPA) is the 3rd busiest airport in Illinois, and 11th in the Great Lakes region. 



Figure 8 
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ECONOMIC DEVELOPMENT 


TRECC is the result of earmarked federal funds in combination with local and 
state authorities. TRECC is the first increment in the development of the Dupage 
Research Park, a 1000+ acre development on airport property. Directors of the research 
park board represent the Dupage Airport Authority, Dupage County, and the University 
of Illinois. The research park is expected to include an entrepreneurial technology 
incubator and satellite university campus(es). Neighboring towns have teamed with 
Fermi National Laboratory, which borders the site, to expand and offer broadband 
services to their communities. The development of Global Distributed Technology 
Centers promises regional sites with facilities enabled for crisis management and security 
activities. 



Economic Development 


• Dupage Research Park 

- State appropriated $34M 

- $5M Released 

• Local, State and Federal 
Initiatives 

- Tech Incubator 

- Broadband Services 

• Global Centers 

- Crisis Management 

- Homeland Security 


Figure 9 
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WHY A CENTER? 


Global Centers help bring academics and the DoD together and make it easy for 
investors, partners and all involved to understand a technology’s value. Kicking the tires 
of a nascent technology is important, as are effective n-way communication of data, 
teamwork and relationships. 


Why a Center? 


Timely and Effective Technology Transfer 

- Get the DoD and Academic Labs Together 

- Understand the “Value Proposition” 

Get People Together 

- Meet, Use and Understand the Technology 

- Collaborate - Physically and Virtually 
Physical Presence 

- Creates an Identity 

- Transportation and Services are Essential 


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THE NATIONAL TECHNOLOGY GRID 


The Alliance and other research initiatives are continuously creating new 
information infrastructures. This map of the National Technology Grid is one example of 
such a continuously changing infrastructure. 



A New Information Infrastructure 


• Alliance Partners 
• NPACI Partners 
• Supercomputing Projects 
• Virtual Reality Environments 

PACI Communities: CIC, SURA, EPSCoR 

Internet NOC 

vBNS Connections: 

- DS3 (45 Mbps) 

- OC3 (155 Mbps) 

- OC12 (622 Mbps) 

- OC48 (2.48 Gbps) 
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GRID LINKS PEOPLE 


The grid provides users with specific or widely distributed resources on a national 
scale. One vision of widespread adoption is that consumer ‘grid appliances’, such as your 
phone or an automobile’s information system, will access grid resources routinely. 



The Grid Links People with 
Distributed Resources 


High-Speed 

Networks 


Collaborative 

Environments 


Databases 
Mass Storage 


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Computers 


♦ Supercomputers 

♦ Experimental Facilities 


Figure 12 
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CREATING COLLABORATIVE WORK SPACES 


The grid enables people, data, and technology to come together as applications, 
i.e., effective productivity tools. 



Figure 13 
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GLOBAL DISTRIBUTED CENTERS MODEL 


A Global Center is typified by multiple connectivity options, enabling people to 
communicate reliably over diverse channels and thereby collaborate. 
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ACCESS - DC 


The Alliance Center for Collaboration, Education, Science and Software 
(ACCESS), located near Washington DC (Arlington, VA), is the prototype center. 
Centers in addition to ACCESS and TRECC exist or are planned in several locations 
around the globe. 


ACCESS 

Global Distributed Centers 



Washington DC; Mississippi; Illinois; Moscow; Manila 

Alliance Center for Collaboration, Education, Science, and Software 



Figure 15 
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ACCESS and the MSCMC 


ACCESS is the host site for the Multi-Sector Crisis Management Consortium 
(MSCMC). 


ACCESS DC and the MSCMC 


ACCESS 

Mission: to advance scientific research in 
computational science through the creation of 
Global Distributed Centers: 

- Explore the development and use of advanced 
technologies 

- Foster national and international partnerships 
between academic, government, public, and 
private sectors 

- Accelerate technology transfer 


MSCMC 

Mission: Promote global Development 
and Deployment of advanced IT 
Strategies and tools for Crisis 
Management and Emergency Response 
Communities for all phases (planning, 
response, mitigation and recovery) 
including virtual reality environments 
over all modes of communications. 
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ACCESS and the ACCESS GRID 


ACCESS has several initiatives that are regional and national/international in 
scope. Here is a statewide example for Virginia. It stresses multi-purpose use of its 
resources, such as K-12 education assistance in addition to stand-by readiness in cases of 
crisis management. 


ACCESS, MSCMC and the Access Grid 

Model Example 


An ACCESS Centers model offers a Virginia-wide system of Centers connected by a high 
speed network to deliver collaborative government sessions, meetings and events. 


• When not scheduled by the Government for meetings 
and events these Centers can be used for educational 
(K-12) activities to bridge the Digital Divide. 


• These Centers can also be used 
in times of Crisis Management to 
facilitate Emergency Response 
like the TOPOFF exercise 
illustrated. 




National Center for Supercomputing Applications (NCSA) http://www.ncsa.uiuc.edu 

Alliance Center for Collaboration Education Science and Software (ACCESS) http://calder.ncsa.uiuc.edu/ACCESS 
Multi-Sector Crisis Management Consortium (MSCMC) http://www.mscmc.org Contact: itt@ncsa.uiuc.edu 


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MSCMC ARC NETWORK 


The Access Response Center (ARC) network outlines global distributed centers in all 50 
states, including up to four mobile systems in each state. 


MSCMC ARC Network 
Multi-Mode Access Response Centers 


Centers in all 50 States 
Multiple Mobile Systems 
Multi-sector Collaboration 

Multiple Use Peace and 
Prosperity Centers 

Emergency Response 
Economic Development 
Life Long Learning 


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TOPOFF EXERCISE 


The TOPOFF exercise was conducted last year. Agencies leased the ACCESS 
facility for a period of days as a test and exercise of their emergency response 
mechanisms. 


ACCESS TOPOFF Exercise 
Homeland Security 



Figure 19 
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TRECC BUILD-OUT 


TRECC started as raw space in the Dupage Flight Center. The ACCESS 
consulting architect along with the TRECC program team was responsible for the 
preliminary design. A local design/build firm provided project architecture and 
construction services. NCSA staff provided equipment plans and deployment, some of 
which had to find alternate means of entry into the building. 


TRECC Build-Out 







6500 SF Raw Space 
Architectural Design 
Equipment Plan 
Contingencies 




Figure 20 
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TRECC FACILITIES 


TRECC has five collaboration areas. A Demonstration area accommodates up to 
40 people, the Training area, approximately 24. Used together in a common gathering 
these areas can accommodate approximately 100. Two Conference rooms seat 
approximately 16 and 8. Studio 6/7 is a demonstration studio for up to 8 people. Staff 
offices may house 10 full-time employees while guest studios are available for short or 
longer-term assignment to visitors, as required. 



TRECC Facilities 


• Collaboration Areas 

• Staff Resources 

• Technological 
Capabilities 


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COLLABORATION AREAS 


The Demonstration and Training areas feature large format, rear-projected 
screens, approximately 18’ wide x 15’ high. Projectors were chosen for their resolution 
and brightness, including flexibility to accommodate emerging and experimental display 
technologies as they become available. 



Figure 22 
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CONFERENCE ROOMS 


All collaboration areas are well equipped. Each has full audio/visual support, 
including microphones, cameras, and appropriate displays. Power outlets and network 
connections are always close at hand, including fiber throughout. A Wi-Fi network 
covers the entire facility, providing wireless connectivity for visitors and staff. 



Conference Rooms 




Figure 23 








STUDIO 6/7 


Studio 6/7 is a demonstration studio, currently showcasing Continuum. 
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CAPABILITIES 


The space, systems and furniture at TRECC are designed for flexible 
configuration to accommodate various groups and events. A broad range of connectivity 
options provides access to computational and collaborative resources. A state-of-the-art 
A/V switching system allows routing of all audio and video signals via touch panel 
control displays or via a web interface. 
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ACCESS GRID 


The Access Grid (AG) is the collaborative backbone of TRECC and is the people 
part of the grid. AG optimizes group-to-group collaboration, and scales from auditorium 
environments, to the desktop, down to handhelds. 
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AG- 150+ NODES and COUNTING 


The AG display depicted here represents a typical collaborative session. The 
operator chooses from a palette of video thumbnails available in a particular virtual venue. 
The operator sizes and positions the windows by dragging them with the mouse across a 
multiple-screen desktop. Distributed data and audio streams are included in a full 
duplex environment. 



Figure 27 
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EDUCATION and TRAINING 


The training facility accommodates remote and local curricula in a fully supported 
collaborative environment. A dedicated server supports e-learning application 
environments. 
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HIGH PERFORMANCE COMPUTING SUPPORT 


TRECC’s Equipment Room accommodates networking, computational cluster, 
collaborative and domain servers, with room for expansion. 



Figure 29 
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ADVANCED NETWORKING 


High Performance Computing requires Advanced Networking. TRECC 
currently has a dedicated OC-3 (155 Mbps) connection to the Chicago NAP, where we peer 
with MREN and other research networks. We are currently contracting with SBC for a 
GigaMAN connection to StarLight, on the downtown campus of Northwestern 
University. Plans are being drawn to expand our connectivity to 10-20 Gigabit in the 
future. 
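
To give a feel for what these link upgrades mean in practice, the following sketch estimates ideal transfer times for a data set over each class of link. The 10 GB data set size is hypothetical, the GigaMAN rate is assumed here to be 1 Gbps, and protocol overhead and congestion are ignored, so real transfers would be slower.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and congestion."""
    return size_bytes * 8 / link_bits_per_second


# Link rates in bits per second. Only the OC-3 figure is quoted in the text;
# the 1 Gbps value for GigaMAN is an assumption made for this sketch.
LINKS_BPS = {
    "OC-3 (155 Mbps)": 155e6,
    "Gigabit (1 Gbps, assumed)": 1e9,
    "10 Gigabit (planned lower bound)": 10e9,
}

DATASET_BYTES = 10e9  # hypothetical 10 GB data set

for name, rate in LINKS_BPS.items():
    print(f"{name}: {transfer_seconds(DATASET_BYTES, rate):.0f} s")
```

Under these assumptions the same data set takes roughly 516 s over OC-3, 80 s over a gigabit link and 8 s at 10 Gbps.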
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GRID CLUSTER 


A 24-processor Linux cluster provides a grid presence for TRECC. Sponsored 
development initiatives include security and grid management applications. 
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LCD DISPLAY CLUSTER 


A 16-Processor Linux cluster drives commodity LCD panels to create a 5120 x 
3840 (19.7M) pixel display for large-scale visualizations in a portable kiosk. 
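
The quoted pixel count can be checked with simple arithmetic. The sketch below assumes, purely for illustration, a 4 x 4 grid of 1280 x 960 panels (one panel per processor); the text does not state the actual panel layout or panel resolution.

```python
def tiled_resolution(cols: int, rows: int, panel_w: int, panel_h: int) -> tuple[int, int]:
    """Total width and height in pixels of a grid of identical panels."""
    return cols * panel_w, rows * panel_h


# Assumed layout: 4 columns x 4 rows of 1280 x 960 panels (hypothetical).
width, height = tiled_resolution(cols=4, rows=4, panel_w=1280, panel_h=960)
total_pixels = width * height

print(width, height)                 # 5120 3840
print(total_pixels)                  # 19660800
print(round(total_pixels / 1e6, 1))  # 19.7
```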
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CONTINUUM 


Continuum is a collaborative project between TRECC and the Electronic 
Visualization Laboratory (EVL) of the University of Illinois at Chicago to develop the 
hardware and software technology, and user-centered techniques for supporting intense 
collaborations in Amplified Collaboration Environments (ACE). 
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CONTINUUM GOALS 


The goal of research in Amplified Collaboration Environments is to design 
future-generation collaboration spaces that take advantage of emerging advanced computing 
technologies, to allow collaborators that are geographically dispersed to work together as 
effectively as in traditional co-located project-rooms. 
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TELE-IMMERSION 


The immersion module consists of an Access Grid Augmented Virtual 
Environment (AGAVE) passive stereo virtual reality display for visualizing 
three-dimensional data sets. The AGAVE is a low-cost VR system utilizing commodity 
conferencing projectors and a single PC equipped with a modest dual-output graphics 
card. 
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COLLABORATIVE ANNOTATION 


An annotation module supports “whiteboarding” during a collaborative meeting. 
The technology employed is a plasma display enhanced with a touchscreen overlay to 
provide pen-based input. 
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WIRELESS INTERACTION 


The Continuum is controlled via a web interface using laptops and Tablet PCs to 
steer data across a single seamless desktop. Work has begun to enable identification and 
authentication of users to tailor information to one’s needs and interests. 
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CONTENT DISTRIBUTION EXAMPLE 


The content distribution module provides a scalable LCD tiled display for 
visualizing high resolution data sets. 
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TERA VISION 


TeraVision is a way to remotely display moving graphics or high-definition video 
over gigabit networks. A basic system consists of a PC server with commodity hardware 
to grab high-resolution VGA or DVI inputs and a PC client to receive and display the 
streams. 
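
A rough calculation shows why gigabit networks matter for this kind of streaming. The sketch below computes the raw bit rate of an uncompressed video signal; the resolutions, 24-bit color depth and 30 frames per second are illustrative assumptions, and the text does not say whether TeraVision compresses its streams.

```python
def uncompressed_mbps(width: int, height: int, bits_per_pixel: int, fps: float) -> float:
    """Raw video bit rate in megabits per second, with no compression."""
    return width * height * bits_per_pixel * fps / 1e6


# Hypothetical capture settings: 24-bit color at 30 frames per second.
for width, height in [(1024, 768), (1280, 1024), (1600, 1200)]:
    rate = uncompressed_mbps(width, height, bits_per_pixel=24, fps=30)
    fits = "fits" if rate < 1000 else "exceeds"
    print(f"{width}x{height}: {rate:.1f} Mbps ({fits} a 1 Gbps link)")
```

Under these assumptions a 1024 x 768 stream needs about 566 Mbps, while a 1600 x 1200 stream would exceed a single gigabit link.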
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PARALLEL TERA VISION 


Multiple TeraVision boxes can be used to stream component video streams of a 
tiled display. In a multicast network the streams can be efficiently shared in an N-way 
distribution. 
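
The efficiency argument for multicast can be made concrete by counting the streams a sending site must transmit. This is a simplified model with hypothetical names: it assumes one stream per display tile and that the network replicates multicast packets, so the sender transmits each stream only once.

```python
def sender_streams(tiles: int, receivers: int, multicast: bool) -> int:
    """Number of streams the sending site must transmit.

    With unicast, every receiver needs its own copy of every tile stream.
    With multicast, the network replicates packets, so one copy per tile suffices.
    """
    if tiles < 0 or receivers < 0:
        raise ValueError("tiles and receivers must be non-negative")
    if receivers == 0:
        return 0
    return tiles if multicast else tiles * receivers


# Hypothetical session: a 4-tile display shared with 5 remote sites.
print(sender_streams(tiles=4, receivers=5, multicast=False))  # 20
print(sender_streams(tiles=4, receivers=5, multicast=True))   # 4
```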
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PARIS 


Using PARIS (Personal Augmented Reality Immersive System), a 
projection-based “augmented” virtual reality display, a surgeon and medical modeler can view a 
three-dimensional model of a patient’s computed tomography (CT) data, and 
collaboratively review, sculpt and “virtually” build an implant using their hands. 

PARIS is optimized to allow users to interact with the environment using a 
variety of tactile input devices. An authentic sensation of the implant sculpting process is 
achieved using SensAble Technologies’ PHANTOM force-feedback device. The PARIS 
display has excellent contrast and variable lighting that allows a user’s hands to be seen 
immersed in the imagery. 
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THE NIST SMART SPACE & MEETING ROOM PROJECTS 


Pervasive computing devices, sensors, and networks provide infrastructure for 
context-aware smart meeting rooms that sense ongoing human activities and respond to 
them. These technologies require advances in areas including networking, distributed 
computing, sensor data acquisition, signal processing, speech recognition, human 
identification, and natural language processing. Open interoperability and metrology 
standards for the sensor and recognition technologies can aid R&D programs in making 
these advances. To address this need the NIST Smart Space and Meeting Room projects 
are developing tools for data formats, transport, distributed processing, and metadata. 
We are using them to create annotated multimodal research corpora and measurement 
algorithms for smart meeting rooms, which we are making available to the research and 
development community. 



The Next Wave 
of Information 
Technology... 


Slide 1 






NEW INTERFACES - VISIONS AND CHALLENGES 


Visionary system concepts presented at this conference, like the MIT Oxygen 
project, DARPA High Productivity Computing Systems program, the IBM Pervasive 
Computing initiative, and NASA cognitive systems concepts offer approaches to high 
productivity in collaborative aerospace engineering design and development 
organizations. Most of these include sensor-based statistical recognition systems for 
speech, speakers, faces, gestures, and even emotional states of users. These can be 
combined into a perceptive interface that has a sense, recognize, understand, and respond 
cycle of operation. The understanding component differentiates a perceptive interface 
from a more traditional stimulus-response perceptual interface. Prerequisite sensor 
systems, such as advanced microphone arrays, to provide adequate signal quality for 
these recognition algorithms are now under development. However, the recognition 
systems needed still have significant error rates and will have to be incrementally 
improved and made more robust with respect to environmental conditions. 
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SMART SPACES - WHAT’S REAL? 


It is optimistic to infer, from existing commercial large-vocabulary dictation 
systems, that systems will soon respond to spontaneous speech. While dictation systems 
do perform adequately for highly codified technical speech, say radiology dictation by a 
physician, they do not yet provide good recognition performance for spontaneous speech. 
This fall, NIST conducted an evaluation of a state-of-the-art large-vocabulary speech 
recognition system and found that it missed two words out of three for spontaneous 
person-to-person speech in meetings. This area still requires additional investigation and 
development, and other technologies, such as face recognition under actual field 
conditions, have similar performance problems. 
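
Recognition performance of this kind is commonly summarized as a word error rate: the 
number of word substitutions, deletions, and insertions needed to turn the recognizer 
output into the reference transcript, divided by the number of reference words. The 
sketch below is a minimal illustration of that calculation, assuming whitespace-separated 
words. The function name and any sample sentences are hypothetical; this is not the NIST 
evaluation tooling or data. 

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted, the rate can exceed 1.0 when the recognizer emits many 
spurious words. 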
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NIST SMART SPACE AND MEETING ROOM PROJECTS 


We address issues of data acquisition using the NIST Smart Data Flow System, 
a set of tools that allows components from various developers to 
interoperate in an environment containing flows from many sensors, and that offers a 
reference implementation for laboratory use. The NIST Meeting Room has over two 
hundred microphones, five cameras, a smart whiteboard, and will soon have a locator 
system for the meeting attendees. In the aggregate, these generate over a gigabyte per 
minute of sensor data, which are time tagged to millisecond resolution and stored for 
research uses. We address broad issues of metadata, or annotations, with semantic 
descriptions using the Architecture and Tools for Linguistic Analysis Systems (ATLAS). 
One of the major design features of ATLAS is standardization of metadata derived 
directly from the sensor data streams, and subsequent higher-level annotations of meeting 
context, which may allow indexing, transcription, and possibly even summarization of 
meetings. Some significant meeting metadata under investigation include spoken words, 
speaker identity, sentence-like units, disfluencies, speaker locations, and time tags. 
From these low-level metadata, smart spaces will have to make higher-level inferences 
about tasks the users are undertaking to become context-aware. 
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ON THE WAY TO UNDERSTANDING... METADATA RICH TRANSCRIPT 


Perceptual interfaces will allow smart meeting rooms to act as a meeting secretary 
by taking meeting minutes from the chairperson in response to commands. The NIST 
Rich Transcription Evaluation series, which began in 2002, seeks to support the 
development of these technologies. A raw machine generated transcript, XML metadata 
enrichments, and human-readable form are shown here to clarify the importance of such 
annotation capabilities. Future cognitive and perceptive interfaces will have to extract 
meaning from word streams generated by speech recognizers. An initial challenge will 
be creation of rich transcripts that are easily readable by humans. This will require 
automated processing of speaker turns, named entity tracking, and later, even topic 
identification. We believe that a long term program providing standard reference 
materials, metrics, and algorithm evaluation will be needed to enable the creation of the 
usable and facile multi modal interfaces of the future. 
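
The step from a raw word stream to XML metadata and then to a human-readable form can 
be illustrated with a small sketch. Everything named here is an assumption made for 
illustration: the element name "turn", the attribute names, and the two helper functions 
are hypothetical and do not reproduce the Rich Transcription or ATLAS formats. 

```python
import xml.etree.ElementTree as ET


def build_rich_transcript(segments):
    """Wrap recognizer output in XML, one element per speaker turn.

    segments: iterable of (speaker, start_seconds, end_seconds, words) tuples.
    """
    root = ET.Element("transcript")
    for speaker, start_s, end_s, words in segments:
        turn = ET.SubElement(
            root,
            "turn",
            speaker=speaker,
            start=f"{start_s:.3f}",
            end=f"{end_s:.3f}",
        )
        turn.text = " ".join(words)
    return root


def to_readable(root):
    """Render the annotated turns as capitalized, punctuated lines."""
    lines = []
    for turn in root.iter("turn"):
        text = turn.text or ""
        sentence = text[:1].upper() + text[1:]
        lines.append(f"[{turn.get('start')}] {turn.get('speaker')}: {sentence}.")
    return "\n".join(lines)
```

The readable form depends entirely on the metadata: without speaker and time attributes, 
the output would be an undifferentiated stream of lowercase words. 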
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SMART SPACES - WHAT’S REAL? 


With current technology we can acquire speech at a distance using a phased 
microphone array, and perform speaker-dependent speech recognition. This currently 
requires a skilled and cooperative speaker, and discourse in the domain of the language 
model. Recent work shows that it is possible to apply a Gaussian mixture model (GMM) 
based speaker verification algorithm to the cepstral coefficients used for speech 
recognition to also identify the speaker. This will allow near real time speech 
recognition for a privileged user in a meeting space. The recognized speech can be used 
to transcribe meeting minutes, or to issue voice commands to the system. This may allow 
useful capabilities to be deployed long before spontaneous speech and natural language 
understanding emerge. 
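
As a rough illustration of GMM-based verification, the sketch below scores feature 
vectors (for example, cepstral coefficients) under a diagonal-covariance Gaussian mixture 
for the claimed speaker and under a second mixture for a general background population, 
then averages the log-likelihood ratio. The use of a background model and all function 
names are assumptions made for this example; the source does not specify the algorithm 
in this detail. 

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for weight, mean, var in zip(weights, means, variances):
        log_density = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
            for x, m, v in zip(frame, mean, var)
        )
        component_logs.append(math.log(weight) + log_density)
    peak = max(component_logs)  # log-sum-exp keeps the sum numerically stable
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def verification_score(frames, speaker_gmm, background_gmm):
    """Average log-likelihood ratio of speaker model to background model."""
    ratios = [
        gmm_log_likelihood(f, *speaker_gmm) - gmm_log_likelihood(f, *background_gmm)
        for f in frames
    ]
    return sum(ratios) / len(ratios)
```

Each model is passed as a (weights, means, variances) tuple. A score above a chosen 
threshold would accept the identity claim; the threshold itself has to be tuned on 
held-out data. 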
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NIST MEETING ROOM DATA COLLECTION FACILITY 


A schematic plan view of the layout and sensor arrangements in the NIST 
Meeting Room Laboratory is shown above. It is a sensor rich environment which 
provides many views of the meetings using twenty-four random placement microphones, 
three linear microphone arrays, five camera views, and an electronic white board. We are 
currently developing enhancements to this facility. One is our Mk-III microphone 
array, which offers an improved signal to noise ratio and onboard conversion of the data 
to UDP/IP packets for direct transport across fast Ethernet. Additional sensors that 
track meeting participant locations using smart badges are also being added. The NIST 
Meeting Room project currently uses this facility to record meetings of small groups, and 
offers the data to research and development communities. 
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MULTI MODAL MEETING RECORDING 


We have recorded meeting room data for industry and academic research and 
development groups. This consists of twenty hours of meeting data, at more than seventy 
gigabytes per hour. These meetings had various subjects including focus groups, game 
playing, expert interviews, and planning meetings. They varied in length from fifteen 
minutes to one hour, and had from three to eight participants. This data will be made 
available through the Linguistic Data Consortium. 
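
The figures quoted above can be cross-checked with simple arithmetic. The sketch below 
treats seventy gigabytes per hour as a lower bound and uses decimal units; the variable 
names are illustrative only. 

```python
GB_PER_HOUR = 70      # "more than seventy gigabytes per hour", taken as a lower bound
HOURS_RECORDED = 20   # "twenty hours of meeting data"

gb_per_minute = GB_PER_HOUR / 60                   # about 1.17 GB per minute
total_tb = GB_PER_HOUR * HOURS_RECORDED / 1000     # decimal terabytes for the corpus
mbit_per_second = GB_PER_HOUR * 8 * 1000 / 3600    # sustained decimal megabits per second
```

The per-minute figure is consistent with the statement that the room generates over a 
gigabyte of sensor data per minute, and the sustained rate of roughly 156 Mbit/s 
indicates why transport and archiving need dedicated infrastructure. 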
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CONNECTIVITY TOOLS NEEDED... 


The NIST Smart Data Flow System was developed in response to the need to 
provide connectivity to the large number of sensors and devices that will be needed to 
construct smart meeting rooms, and perceptual interfaces. An operational flow graph for 
data review in the NIST Meeting Room is shown above. The Smart Data Flow System 
generates the connections and transports the data among the clients represented by the 
blocks. Dragging and dropping the components and naming of the flows can be used to 
reconfigure the application flow graphs. The system consists of a defined middleware 
API for real-time data transport, and a connection server for sensor data streams. 
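
The idea of clients connected by named flows through a connection server can be sketched 
as a small publish and subscribe router. This is a toy, single-process illustration under 
stated assumptions: the class name, method names, and flow names are hypothetical and are 
not the NIST Smart Data Flow System API. 

```python
from collections import defaultdict


class ConnectionServer:
    """Routes each named flow from its producer to every subscribed client."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, flow_name, callback):
        self._subscribers[flow_name].append(callback)

    def publish(self, flow_name, payload):
        for callback in self._subscribers[flow_name]:
            callback(payload)
        return len(self._subscribers[flow_name])


# Example graph: array -> "beamformer" client -> recorder client.
server = ConnectionServer()
recorded = []
server.subscribe("audio.beamformed", recorded.append)
# The stand-in beamformer simply averages the channels of each frame and republishes.
server.subscribe(
    "audio.array",
    lambda frame: server.publish("audio.beamformed", sum(frame) / len(frame)),
)
server.publish("audio.array", [1.0, 2.0, 3.0])
```

Reconfiguring the graph amounts to changing which flow names the clients subscribe to, 
which mirrors the renaming of flows described above. 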
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METRICS RESEARCH NEEDED 


Significant research is required in order to construct measurement protocols for 
mixed initiative systems that allow multiple actions, choices, and responses. Experience 
in the speech recognition community showed that well drawn measurement programs are 
very important to ongoing technical improvements in new technologies. Reference data 
sets will be needed for the several recognition tasks required in Smart Spaces, and Smart 
Spaces can provide a test bed for integrated functionality of the technologies. 
Measurement tasks for multiple cascaded technologies will have to be designed. 
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MEETING ROOM DATA COLLECTION LABORATORY 


The NIST Smart Space employs a combination of software and hardware to 
provide a test bed for sensor based interface components. It provides data acquisition, 
archiving, time tagging, and data transport for the recognition components that will make 
up the multi modal collaborative interfaces envisioned in this conference. Some 
examples of possible technologies are listed above. 
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NIST SMART DATA FLOW MIDDLEWARE 


The NIST Smart Data Flow System is a proposed reference implementation of 
data transport and format standards for sensor intensive interfaces. Our prototype has 
been tested in the NIST science programs. Such systems must be distributed across 
numerous nodes, provide standardized data transport mechanisms and data types, but 
allow for more to be defined. They must also abstract connectivity mechanisms and 
support device/service discovery, be fault tolerant, and allow mobile nodes to come and 
go from the environment. 
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MULTI MODAL PROCESSING 


Some examples of visual interface processing might include face localization 
using skin color detection, face normalization using reference points like eye locations, 
and gesture recognition. Acoustic interface components, based on phased array 
processing, include source location, speech recognition, speaker identification, and sensor 
fusion with visual data. 
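
Phased array processing can be illustrated with the simplest beamformer, delay and sum: 
each channel is shifted by the arrival delay expected for the chosen source position, and 
the aligned channels are averaged so that the source adds coherently. The sketch assumes 
integer sample delays that are already known; the function name is hypothetical and this 
is not presented as the NIST implementation. 

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay and average the results.

    channels: list of equal-length sample lists, one per microphone
    delays:   per-channel arrival delay in samples (non-negative integers)
    """
    if len(channels) != len(delays):
        raise ValueError("one delay is required per channel")
    usable = len(channels[0]) - max(delays)
    output = []
    for n in range(usable):
        # Source sample n arrives at microphone m at index n + delays[m].
        total = sum(ch[n + d] for ch, d in zip(channels, delays))
        output.append(total / len(channels))
    return output
```

With the correct delays a pulse is reconstructed at full amplitude; with mismatched 
delays the same pulse is smeared and attenuated, which is what gives the array its 
directional sensitivity. 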
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USABILITY FEATURES NIST SMART DATA FLOW SYSTEM 


We have worked with several laboratories to understand problems encountered 
when building sensor intensive smart spaces. We found that our users are interested in a 
more streamlined capability. We are forming an industry working group to work on 
further definition of standards that will facilitate integration and testing of the many 
technology components necessary to implement the advanced mixed initiative systems 
envisioned for the future. 
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NIST SMART DATA TRANSPORT ABSTRACTION FOR 
BUFFERED REAL TIME CONNECTIVITY 


The NIST Smart Data Flow system provides an abstraction for connectivity and 
data buffering to facilitate the construction of the needed multi-process, distributed 
systems. The pointers to data flows “in” and “out” can reside on remote systems with the 
connectivity being defined at the graphical level. This allows data flow component 
libraries that integrate various real time signal processing and recognition capabilities to 
be defined and used in a variety of flow graph contexts. Substantial code is generated by 
the data flow middleware to support this simplified application structure. 
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NIST MARK-III MICROPHONE ARRAY 


NIST has been involved in developing spoken language corpora to support the 
training and testing of large vocabulary speech recognition systems. These data sets 
progressed from a one-thousand word structured task, to five-thousand word vocabulary 
readings, to twenty-thousand word readings, to recorded broadcast news programs, and 
most recently to spontaneous speech in small group meetings. The meeting room 
laboratory offers multiple views of speech at progressively greater distances with close- 
talk microphones at one or two inches from the lips, lapel microphones, table top 
microphones, and wall mounted microphone arrays. We have distributed this technology 
to interested research and development laboratories, such as the Georgia Institute of 
Technology Aware Home project, shown above in the New York Times on April 5, 2001. 
The array technology shown is of the Mark-II series, which was our first digital phased 
array. 
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THE MARK-III MICROPHONE ARRAY 


An operational prototype of the Mark-III series microphone array is now 
completed, and has important advantages in terms of manufacturability, signal quality, 
price, and deployability in research environments. We are making the construction data 
available to interested research and development laboratories. This mark uses 
twenty-four bit analog to digital conversion, an Ethernet based data interface, and a 
local field programmable gate array that reads the ADCs and creates TCP/UDP frames that 
are sent to other Smart Data Flow Nodes. The hardware architecture consists of a 
motherboard which reads digital interfaces from each of eight daughter cards, each with 
eight microphones. This mark has improved signal to noise characteristics due to the 
short analog signal runs, and improved ADC resolution and noise floor. 
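
To make the data path concrete, the sketch below packs one sample from each of 
sixty-four channels, at twenty-four bits per sample, into a byte payload of the kind that 
could be carried in a UDP datagram, and unpacks it again. The frame layout is an 
assumption made for illustration (a four-byte big-endian sequence number followed by 
big-endian signed samples); the actual Mark-III frame format is not given in the text. 

```python
import struct

CHANNELS = 64          # eight daughter cards, each with eight microphones
BYTES_PER_SAMPLE = 3   # twenty-four bit conversion


def pack_frame(sequence, samples):
    """Build a payload: hypothetical 4-byte sequence header plus 64 samples."""
    if len(samples) != CHANNELS:
        raise ValueError(f"expected {CHANNELS} samples, got {len(samples)}")
    header = struct.pack(">I", sequence)
    body = b"".join(
        s.to_bytes(BYTES_PER_SAMPLE, "big", signed=True) for s in samples
    )
    return header + body


def unpack_frame(payload):
    """Recover the sequence number and the list of signed 24-bit samples."""
    (sequence,) = struct.unpack_from(">I", payload, 0)
    body = payload[4:]
    samples = [
        int.from_bytes(body[i:i + BYTES_PER_SAMPLE], "big", signed=True)
        for i in range(0, len(body), BYTES_PER_SAMPLE)
    ]
    return sequence, samples
```

Under this layout each frame is 196 bytes, so several sample instants could be batched 
into one datagram; a sequence number lets a receiver detect lost datagrams, which matters 
because UDP does not guarantee delivery. 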



Slide 17 







SMART SPACE PROTOTYPE TECHNOLOGIES 


Our Meeting Room apparatus consists of microphone arrays acquiring sixty-four 
channel audio input flows and offering them for subscription. A beamformer subscribes 
to these, reduces them to a single channel, and offers the result as an audio flow. Many 
sensor and recognition technologies are under development in industry, so 
interoperability and integration issues are crucial to new generation smart 
environments. The NIST Smart Data Flow System is being used to integrate technologies 
including speech recognition, speaker identification, face localization and recognition, 
channel normalization, video and acoustic displays, and wireless PDAs. Standardized 
formats are offered for multimedia data streams, archiving, retrieval, and review tools. 
Hand crafting the needed inter-process communication was found to be very labor 
intensive and brittle with respect to changing requirements for new sensors and 
configuration changes to accommodate equipment faults. The NIST Smart Data Flow System 
toolkit has components for graphical configuration of flows, allocation of the graph 
nodes to distributed systems, and connection by TCP or UDP over IP. Data transport code 
is provided by the Smart Data Flow System libraries. We hope to make this the basis of a 
standards working group, to collect additional requirements from industry, and to work 
cooperatively to develop reference implementations for smart meeting and multi modal 
interfaces. 


Source localization 
Acoustic/video sensor fusion 

■ Large scale data collection for smart space R&D 


Slide 18 




SENSORS WILL ALLOW PERSONAL INTERFACES 


The multi modal, recognition based interfaces of the near future will allow 
personalized interfaces that respond to selected individuals and maintain user profiles 
and session histories, providing some degree of context awareness. This will enable the 
interactive, mixed initiative, project design and management environments envisioned in 
this conference. 
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VISION OF THE POSSIBLE: A MEETING ROOM THAT... 


We are proposing an integration challenge to the providers of the statistical 
recognition software and other relevant components. It is designed to be at the edge of 
the current state of the art. It would consist of a meeting room using microphone array 
technology that is sensitive to a meeting chairman in particular over the other meeting 
participants. This will require integration of several technology components in cascade 
and parallel to provide the necessary signal acquisition, conditioning, preprocessing, 
recognition, and responses. Such a multi modal system could also serve as the 
foundation of accessible computing, with standards for identifying user preferences and 
needs, and protocols to communicate them to host smart environments. 
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ACCESSIBILITY PROTOTYPE: HANDS FREE SERVICES 


A near term, and humane, use of the technology involved in the integration 
challenge would be to provide accessible computing, for example when a user wants or 
needs to operate the computing, meeting, and presentation environment hands free. This 
will require user and service discovery, using dynamic networking and appropriate 
security safeguards. An initial prototype includes a PDA with 802.11 wireless 
networking that negotiates for services, communicates user preferences, and uploads 
personal profiles for speech, speaker, and possibly other recognition training data. 
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PDA INTEGRATION FOR ACCESSIBILITY EXPERIMENTS 


A prototype of the device discovery protocol uses wireless PDAs with 802.11 
networking and services including DHCP, HTTP, and CGI, as well as INCITS-V2 
protocols, to communicate user preferences and to initiate the required multi modal 
interface services and applications. 
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PERSONALIZED USER INTERFACES: USER DISCOVERY 


The device/service discovery protocols will allow user preferences like 
hands-free, eyes-free, and ears-free operation to be communicated to the multi modal 
service environment provided by smart spaces. The NIST Smart Data Flow System can be 
used to integrate real time 
services that support speech recognition, activate screen readers, or closed captioning as 
specified in user preferences. Other service graphs can be defined and used to respond to 
additional preferences as the technologies emerge. 
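
One way to picture the preference handling is a lookup from each communicated preference 
to the services the environment should start. The pairing used below (hands-free with 
speech recognition, eyes-free with a screen reader, ears-free with closed captioning) is 
an assumed reading of the text, and the table and function names are hypothetical. 

```python
# Hypothetical mapping from a user preference to the services it activates.
SERVICES_BY_PREFERENCE = {
    "hands-free": ["speech_recognition"],
    "eyes-free": ["screen_reader"],
    "ears-free": ["closed_captioning"],
}


def services_for(preferences):
    """Return the services to start, in order, without duplicates."""
    selected = []
    for preference in preferences:
        for service in SERVICES_BY_PREFERENCE.get(preference, []):
            if service not in selected:
                selected.append(service)
    return selected
```

Unknown preferences are ignored rather than rejected, so a profile written for a richer 
environment still works in a simpler one. New entries can be added to the table as 
further service graphs become available. 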
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EXAMPLE: SPEAKER ID FLOW GRAPH 


An example of a distributed flow graph that could provide some components 
needed is shown above. It captures data from a NIST microphone array, and sends it to a 
source bearing estimator, a beam former, and a computer controlled camera that can point 
to a speaker. This graph is currently operational in the NIST Smart Space Laboratory on 
an experimental basis. Additional components could be integrated that make use of the 
video, and audio services available in the existing smart space framework. 
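
A source bearing estimator of the kind named in this graph can be sketched for a single 
microphone pair: find the lag that best aligns the two channels, convert it to a path 
length difference, and solve for the arrival angle under a far-field assumption. The 
function names, the fixed speed of sound, and the two-microphone simplification are all 
assumptions made for this example; the estimator used in the laboratory is not described 
in the text. 

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at room temperature


def best_lag(reference, delayed, max_lag):
    """Lag in samples that maximizes the cross-correlation of two channels."""
    def correlation(lag):
        return sum(
            reference[n] * delayed[n + lag]
            for n in range(len(reference))
            if 0 <= n + lag < len(delayed)
        )
    return max(range(-max_lag, max_lag + 1), key=correlation)


def bearing_degrees(lag_samples, sample_rate_hz, mic_spacing_m):
    """Far-field arrival angle, measured from broadside, for one microphone pair."""
    path_difference_m = SPEED_OF_SOUND_M_S * lag_samples / sample_rate_hz
    # Clamp to the valid range so measurement noise cannot break asin().
    ratio = max(-1.0, min(1.0, path_difference_m / mic_spacing_m))
    return math.degrees(math.asin(ratio))
```

The resulting angle could drive both the beamformer steering delays and the pan setting 
of a computer controlled camera. 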
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WHAT CAN NIST DO FOR THIS COMMUNITY? 


NIST is interested in aiding U.S. industry through the use of measurements and 
standards. Our Smart Space and Meeting Room projects offer metrology, reference data, 
and proto-standards for data transport and formats. We can also participate in standards 
working groups and publish non-regulatory standards in aid of industry groups to 
promote interoperability. 
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MEASUREMENTS AND STANDARDS WILL BE KEY... 


To summarize: we believe that the development of the advanced cognitive 
interfaces discussed at this workshop can be facilitated by standardization and 
performance metrics. We would like to discuss the matter with interested parties in the 
research and development communities in industry, academic, and government 
laboratories. 
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GRID COMPUTING INFRASTRUCTURE 


Geoff Brown 
Oracle Corporation 
Redwood Shores, CA 
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THE IT CHALLENGE 


Your IT department is under constant pressure. You have to implement, maintain 
and improve the operational systems that run your company, and also design and 
create additional systems that can provide competitive advantages for your business. 
These systems could be used for a variety of purposes, from deeper analysis of market 
and business trends, to an improved customer service experience, to reducing overall 
product costs. 


The IT Challenge 


• Competitive Advantage 

• Operational Systems 
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THE IT RESPONSE 


To succeed, you must try to meet this challenge, but also deliver even greater 
value. One effective way to accomplish this is to reduce the portion of the budget 
required to meet operational costs, which allows you to use more of your resources to 
provide competitive advantage. And you would like to accomplish this while saving both 
money and time to market. 


The IT Response 
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Infrastructure Costs 

Low Utilization of Processor Resources 

Low Utilization of Storage Resources 

Weak Systems Management Capabilities 

Weak Asset Management Capabilities 

High Cost 

Slow Provisioning 

Inadequate SLA’s 

...and It is Only Getting Worse! 
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ROBUST AND FLEXIBLE INFRASTRUCTURE 


You can reduce your overall costs with a powerful, robust and flexible 
infrastructure. Rather than having the complexity of your infrastructure be a budget 
consumer, taking time and resources away from your overall budget, you could choose 
the right infrastructure and have your choice reduce your total cost of ownership while 
increasing the productivity of all of your IT staff. As Nick Gall of META Group says, 
the better the infrastructure, the greater the benefits. 

The main way an infrastructure can provide value is through increased 
productivity. The more functionality your infrastructure supplies, the less time you will 
have to spend implementing and maintaining that functionality in your IT systems. 
Providing productivity is half the equation. If you have to spend the same amount of 
time implementing a feature in your infrastructure that you would in your application 
systems, the net benefit is zero. The easier it is to obtain a benefit, the greater the overall 
value. 


Another important aspect of a standardized IT infrastructure is that you can use it 
over and over without any additional implementation work. An infrastructure that can 
provide ongoing benefits from your original investment will provide the greatest value 
for your organization. 


“The more robust the infrastructure, 
the greater the total benefit... 

...Flexible infrastructure produces greater 
productivity and lower TCO across the 

stack.” 

-Nicholas Gall 
Vice President 
META Group 
2002 
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STANDARDIZED INFRASTRUCTURE ELEMENTS 


To achieve these goals, your business needs an Unbreakable Software 
Infrastructure. Oracle’s Unbreakable Software Infrastructure provides a wealth of 
functionality. The advanced features of Oracle’s Unbreakable Software Infrastructure 
can help to solve your tactical business problems today, as well as act as a strategic 
investment in the future of all your IT systems. 


Standardized Infrastructure 
Elements 



Access 
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STANDARD SOFTWARE INFRASTRUCTURE 


The core of the Oracle Unbreakable Software Infrastructure is information - your 
data. Your data is one of the most valuable resources of your company. Virtually all of 
your information systems are built on your data. By making data the core of an 
Unbreakable Software Infrastructure, you are building on the core of your company’s 
valuable information. 

With Oracle’s Unbreakable Software Infrastructure, you can keep all of your data 
in one centralized repository - data from your transactional (OLTP) systems, data used 
for business intelligence functions, and a wide variety of other documents, such as Web 
content, E-mail, and calendar and resource scheduling information. 

This centralized repository reduces your overall management overhead and allows 
you to consolidate the number of servers in your organization, which will further reduce 
overhead. By having all of your data in a single repository, you also reduce the need for 
resource-consuming data transfers required for multiple uses of the same data. 


Standard Software Infrastructure 



Access 


Central Repositories Manage All Your Data 
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BENEFITS OF CENTRALIZING DATA 


The advantages built into Oracle’s Unbreakable Software Infrastructure provide 
transparent benefits for all your data. For instance, Oracle’s Infrastructure gives access to 
all data for large numbers of users, without any performance impediments caused by 
locking issues or extensive coding to work around potential problems. Oracle’s 
Infrastructure can scale up or out, seamlessly, which guarantees you both scalability and 
the cost benefits of using commodity hardware. Oracle lets you add the widest variety of 
indexes to all of your data, which can in turn provide rapid access. You can even define 
your own custom indexes for your own specific data. Oracle9i gives you a new feature 
which will automatically compress the stored representation of your data, saving you 
storage space and improving the performance of your application systems and 
maintenance operations. And Oracle’s Unbreakable Software Infrastructure lets you 
separate any and all of your data into partitions in many different ways - for 
maintenance, security or performance considerations. These benefits are all available to 
all your applications - without any additional coding or maintenance on your part. The 
cost of these benefits is zero. 


Benefits of Centralizing Data 


Better Performance 
Scalability 

- Lots of data 

- Lots of users 
Reduced Hardware 
Easier to Manage 





Figure 7 





STANDARD SOFTWARE INFRASTRUCTURE 


Data is the core of our Unbreakable Software Infrastructure, but information 
systems do more than simply store and retrieve data. Systems are built to interact with 
data to create business processes used to support and enhance business operations. 

Oracle’s Unbreakable Software Infrastructure helps to create and deploy your 
business processes. The advantages of Oracle’s Infrastructure help you to create business 
processes quickly and efficiently to respond to the demands of your environment. 


Standard Software Infrastructure 



Access 


Application Infrastructure 
Manages Business Flows 
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BENEFITS OF STANDARDIZED APPLICATION ARCHITECTURE 


Oracle’s Unbreakable Software Infrastructure includes features that can make it 
easier for you to create and maintain your vital business processes. You can create data- 
aware Java components, pre-baked with all the functionality they will need to access and 
manipulate data. 

Our Infrastructure includes transparent caching for not only data, but the results of 
processes, such as HTML pages or fragments. Retrieving cached data is much faster than 
recreating it, and your application systems will perform better - transparently. Oracle 
includes tools to easily manage the way you use this caching. 

Oracle includes a special feature to pre-calculate aggregate values, which are 
frequently used in data warehousing. Of course, you can use this capability without any 
modifications of any of your applications. Oracle’s Infrastructure even includes wizards 
to suggest which pieces of data could benefit from this type of pre-calculation. 

Your own business processes are unique to your own business situation. That’s 
why Oracle lets you create your own functions, which you can use in any application or 
SQL code, just like standard built-in functions. The productivity gains provided by the 
Unbreakable Software Infrastructure can extend into the particulars of your own specific 
business. 


Benefits of Standardized 
Application Architecture 



Information 


Access 


Rapid Development 
Rapid Deployment 
Quicker Integration 
Less Cost 
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STANDARD SOFTWARE INFRASTRUCTURE 


Processing is not the end goal of any information system. You also have to 
deliver the results of those business processes to your clients across a wide range of 
channels. Using a single business process across many channels can significantly 
improve the productivity of your development effort, as well as reduce the need for 
redundant systems that require constant maintenance and synchronization. 

The final step of data access is also where an infrastructure becomes truly 
Unbreakable with efficient and powerful security mechanisms. 


Standard Software Infrastructure 



Secure Multi-Channel Access 
For All Your Users 
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BENEFITS OF MULTI-CHANNEL ACCESS 


You can propagate the results of your business processes to multiple channels, 
without having additional logic or redundant applications to address the needs of each 
individual channel. Whether the final destination of the information generated by a 
process is a standard client machine, a Web page, a portal or a mobile device, Oracle’s 
Infrastructure provides easy support for each channel. 

To make it easier for your users to access the data they need, Oracle’s 
Infrastructure provides powerful search capabilities. 

Oracle has been a leader in secure access for many years. Built into our 
Unbreakable Software Infrastructure are features that can provide a single digital identity 
for all applications, so that your users only have to log on once a day. 

Oracle has extremely flexible security, which allows you to limit access to data 
based on the value of the data. For instance, one column in a table could have a value 
that is used as a label to allow or prevent access to the information in that row. You can 
implement this content-based security on the data itself, so it will apply for all systems 
that access the data. 
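
The idea of content-based access can be modeled in a few lines: each row carries a label 
value, and a reader sees only the rows whose label is among that reader's clearances. 
This is a plain Python model for illustration, with a hypothetical column name and 
function; it is not Oracle's API or syntax. 

```python
def visible_rows(rows, user_clearances, label_column="access_label"):
    """Return only the rows whose label value is among the user's clearances."""
    return [row for row in rows if row[label_column] in user_clearances]
```

Because the filter depends only on the data and the reader's clearances, every 
application that reads through it gets the same result, which is the property the 
paragraph above describes. 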

Oracle provides encryption of your data as it is stored and in transit, as well as 
selective encryption if needed. 


Benefits of Multi-Channel Access 


All Users, Anywhere, Any Time 
Rapid Access to Information 
Better Security and Privacy 
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UNBREAKABLE SOFTWARE INFRASTRUCTURE 


Oracle’s Unbreakable Software Infrastructure excels in 5 areas crucial to the 
value of any infrastructure - performance, scalability, availability, security and 
manageability. You will be seeing examples of customers and independent proof points 
for Oracle’s leadership in each of these areas throughout the day today. 


Unbreakable Software Infrastructure 


Performance 

Scalability 

Reliability 

Security 

Manageability 





Figure 12 






EVOLUTION OF COMPUTING 


Of course there have been other IT infrastructures in the past. In the Age of Big 
Iron, customers used mainframes to run their infrastructure. These had significant 
advantages in quality of service and efficiency. But they were also costly and 
inflexible, leading to large application backlogs. 

Client-Server computing arose in response to this. This swung the pendulum to 
the opposite pole by highly distributing systems. While this reduced the initial purchase 
price of systems and provided greater flexibility, it also cost more in integration and 
quality of service problems. 

Next generation infrastructures balance these centralized-decentralized designs by 
gaining the advantages of consolidation while retaining flexibility in application design. 


Evolution of Computing 


Mainframe 

• Standards 
• Metrics 
• Quality 
• Control 
• High Utilization 
• Upfront cost 
• Proprietary 

Client-Server 

• Best-in-Class Tech 
• Many choices 
• Few standards 
• Low Quality 
• Complex Integration 
• Islands of Apps 
• Islands of Platforms 

Next Generation 

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• Standard components 

• Active Management 

• Self Healing Services 

• Auto-provisioning 

• Virtualized Environment 

• Service Level Agreements 
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NEW INFRASTRUCTURE DESIGN STRATEGY 


This new infrastructure model follows a design strategy built on a few 
simple principles. Build a single infrastructure for your entire IT department. Think 
holistically about data management, application processing, networking and user access. 
Create a few large pools of resources that can be used across all applications, not 
individual islands of systems. 

Standardize on a few pieces of infrastructure software - databases, application 
servers, etc. Best of breed technology is not necessarily the most cost-effective. Enforce the adoption of 
architecture standards for all your applications. Make your applications take full 
advantage of the standardized infrastructure over time. Ensure that you have a 
comprehensive end-to-end system management solution for your infrastructure. You 
cannot scale up your infrastructure without resolving this issue. 


Grid Computing 

Persistent environments that enable 
software applications to integrate 
instruments, displays, computational 
and information resources that are 
managed by diverse organizations in 
widespread locations. 

-- The Globus Project 
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New Infrastructure Design Strategy 

Attack the Whole Problem 

- Data, Applications, Users 
Virtualize Resources for Flexibility 
Standardize Infrastructure Software 

- Best of Breed is not necessarily most cost- 
effective 

Enforce Application Architecture Standards 
Comprehensive System Management 
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Infrastructure Design Elements 



Distributed 

Computing 
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TECHNOLOGY WILL HELP 


There are numerous technical breakthroughs which have made this possible, 
including low cost computing components; new clustering designs that enable a modular 
approach to system design; web services and other integration techniques; and 
improvements in self-managing computing systems that enable greater scale up. 


Technology Will Help 

Low Cost Components: Intel, Linux 
Blade Hardware 

Virtualized Processing, Storage and I/O 
Web Services 
Autonomic Computing 
-Automated Infrastructure Management 
-Self Healing Infrastructure 
-Architecture Patterns 
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Figure 18 


Standard High Volume 
Components 

IA-64 

- Low cost, high performance server processor for 
HP, IBM, Dell, etc. 

Linux 

- Low cost server operating system 

New Fabrics and Faster Interconnects 

- Infiniband, Ethernet, etc. 
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Virtualized Data Center Resources 


Centralized Pool of Resources 

- Storage 

- Processing 

- I/O 

Resources can be Isolated and Dedicated 
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What’s the Problem with Storage? 

Islands of Storage 

Storage Tightly Coupled to Applications and 
Servers 

Storage Utilization is often < 50% 

Storage Administration Costs are Skyrocketing 

...Storage is growing 30%+ per year 
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STORAGE VIRTUALIZATION 


With storage virtualization we eliminate the islands of storage. By consolidating 
and virtualizing we can dramatically reduce waste and inefficiency. 
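The move from physical to virtual disks can be sketched in a few lines. This is only an illustration under stated assumptions: the `StoragePool` class, the disk names and the extent counts are hypothetical and do not describe any vendor's product. The pool hands out extents from whichever physical disk has space, so a server sees one virtual disk and no capacity is stranded on an island.

```python
class StoragePool:
    """Hypothetical pool of fixed-size extents drawn from several physical disks."""

    def __init__(self, disks: dict):
        # disks maps a physical disk name to the number of extents it holds.
        self.free = [(name, idx) for name, count in disks.items() for idx in range(count)]
        self.total = len(self.free)

    def create_virtual_disk(self, extents: int) -> list:
        """Return the (physical disk, extent) pairs backing a new virtual disk."""
        if extents > len(self.free):
            raise ValueError("pool exhausted")
        allocated, self.free = self.free[:extents], self.free[extents:]
        return allocated

    def utilization(self) -> float:
        """Fraction of all extents in the pool that are allocated."""
        return 1 - len(self.free) / self.total


pool = StoragePool({"diskA": 4, "diskB": 4})
vdisk = pool.create_virtual_disk(6)  # spans both physical disks
```

A six-extent virtual disk would not fit on either four-extent physical disk alone; carved from the pool it does, which is the utilization gain the text refers to.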


Storage Virtualization 


Yesterday 
& Today 


SANs Today 


Block 

Virtualization 



Servers directly 
attached to disks 

Servers “mapped” 
to physical disks 

Servers mapped 
to virtual disks 
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Storage Virtualization Benefits 

Single File Systems with One Name Space 
Common Point of Management 
Heterogeneous File Sharing 
Policy-Based Automation e.g. File Placement 
Local File Systems Performance 
Increase Storage Admin Productivity 
Improve Capacity Utilization 
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PROCESSOR VIRTUALIZATION 


Through these blades we can create a huge pool of computing capacity available 
on demand. There are also new partitioning capabilities being built into SMP systems 
that enable sharing of resources. This is another approach that appeals to customers. 


What’s the Problem with Servers? 

Islands of Servers for Each Application 
Too Many Independent Servers 
Average CPU Utilization is Low (< 25%) 
Multiple Unique Vendors and Versions 
Complex and Different Software Stacks 
Poor Systems Management 
Slow Provisioning 

... the More Servers Added, the Lower the 
Utilization of Assets. 
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Processor Virtualization 


Blade Servers 



Hundreds of Intel 
processors in a single 
rack. 

Sophisticated 
management tools 

Self-healing 

capabilities 

Excellent for Web 
Servers, Clusters, etc. 


Virtual Partitioning 
Servers 



Large multi-processor 
systems. 

Physical and virtual 
partitioning 

Excellent for 
consolidating servers 

Dynamic CPU 
utilization 
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VIRTUALIZING THE DATA CENTER 


This diagram shows how a data center moves from separate islands of resources 
today to virtualized pools of resources. 



Figure 26 
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UTILITY COMPUTING EXAMPLES 


CSFB: The BladeFrame gives CSFB flexibility. Provisioning and configuration 
tasks that require three weeks or more with legacy servers are performed in just minutes 
on the Egenera system, enabling CSFB to accommodate growth and launch new applications 
in a timeframe never before possible. Simplifying server deployment also allows 
developers to focus on strategic initiatives, which means they can respond more quickly 
to business opportunities. The BladeFrame helps CSFB adapt to change faster, giving 
it a powerful advantage in a highly competitive market. 

CDC IXIS Capital Markets: The financial modeling CDC IXIS Capital Markets 
uses to predict the outcomes of particular events provides the entire underpinning of their 
operation. They use a sophisticated application that is able to model a number of 
parameters and help them to predict the most likely outcome of a position. The key is to 
ensure that every aspect of what might happen in each market is covered, and to be able 
to report the effects of changing parameters rapidly to the customer. Handling that 
volume of complex modeling calls for a great deal of processing power, but no storage. 
CDC IXIS Capital Markets uses a server farm of NEBS Level-3 certified Netra(TM) t1 
Model 100/105 servers from Sun Microsystems, Inc. to run the financial modeling 
application. 


Utility Computing Examples 

Oracle 

- Solaris blades automating RDBMS development 

Credit Suisse First Boston 

- Egenera blade farm supporting financial and web 
applications 

Department of Energy 

- 1,400 Linux blades to study materials design 

Celera 

- 1000 Alpha nodes analyzing human genomic data 
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ORACLE’S EXPERIENCE 


Oracle has also moved to a Utility Computing model for some of its internal 
application development. 


Oracle’s Experience 

Internal Testing of Oracle Database Software 

- Many engineers running separate test programs 

- Need rapid turn around on results which means 
peak workloads for different engineers at different 
times 

Pooled Computing Infrastructure 

- Resources in two different geographic locations 

- Servers running regression testing around the 
clock 

Faster Throughput, Lower Cost 
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AUTONOMICS 


Even as companies move to a utility computing model, the issue of resource 
management becomes critical. As you build larger and larger computing pools, human 
beings become challenged to manually manage these resources effectively. The answer 
is to build systems that manage themselves. 



Figure 29 
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WHAT’S THE PROBLEM WITH SOFTWARE INFRASTRUCTURE? 


So we’ve talked about the movement to low cost, high volume hardware and then 
pooling this hardware to create a large flexible resource for all your applications. The 
next problem is the software infrastructure. Today most customers use multi-vendor 
solutions. This complexity diverts focus from the business requirements. 

Data is ignored 
Politics rule 

The IS organization and business users can't work together 
There is no plan 

Processes are implemented for the enterprise, not the customer 
A flawed process is automated 
No attention is paid to skill sets 

The key is to standardize the infrastructure software so that there is an easy way 
to install, maintain and upgrade your infrastructure. Applications can be written to use 
this infrastructure in a consistent fashion. All this means gains in efficiency. 


What’s the Problem with Software 
Infrastructure? 


Networks Application 

Servers 

Web Servers 

Mobile 

Computing Directory 

Security 

Databases 

Storage 

Management 
Messaging Tools 


Portals 

Business 

Intelligence 


Too many incompatible pieces 
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WHAT’S THE PROBLEM WITH SYSTEM MANAGEMENT? 


Expensive to operate 

Manual labor intensive deployment and changes 
Inefficient asset utilization 
Weak systems mgmt 

Inflexible and complex 

Multiple architectures for apps and customers 
Highly complex because they are all different 
Integration is complex and costly 

Error prone, unreliable and slow 

Human factor in change requests 
Limited high availability built for specific apps only 
Lack integrated management 

...how are we going to manage our environment in the next decade? 


What’s the Problem with System 
Management? 

Expensive to Operate 

Inflexible and Complex 

Error Prone, Unreliable and Slow 

... How are We Going to Manage Better? 
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CHALLENGE: WORKLOAD MANAGEMENT 


The second challenge is how to deal with the allocation of resources within the 
pool to various applications. It’s not just meeting the processing demands of the 
applications but also determining appropriate HA strategies for each application, dealing 
with spikes in demand that randomly occur, as well as figuring out future capacity needs. 

This can be far beyond the ability of humans to handle when you’re dealing with 
a huge resource pool. 


Challenge: Workload Management 

Installation, Configuration, Backup/Restore 

Partition/Control Short-Term Load Among 
Nodes 

Allocation of Nodes 

- HA spares 

- Handling spikes 

- Integrate capacity planning w/ growth as needed 

Support for Necessary User Choice 

- Multiple OS in the same rack (Windows, Linux, 
Proprietary UNIX) 
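The node-allocation problem on this slide can be illustrated with a small sketch. It is a minimal example under stated assumptions, not a description of any product: the function name, the priority scheme (lower number is served first) and the application names are all hypothetical. HA spares are set aside before any application is served, and whatever remains is granted in priority order.

```python
def allocate_nodes(total_nodes, ha_spares, requests):
    """Grant nodes to applications in priority order after reserving HA spares.

    requests is a list of (application, priority, nodes_wanted) tuples;
    a lower priority number is served first (an assumed convention).
    Returns (grants, nodes_left_free).
    """
    available = total_nodes - ha_spares
    if available < 0:
        raise ValueError("more HA spares requested than nodes in the pool")
    grants = {}
    for app, _priority, wanted in sorted(requests, key=lambda r: r[1]):
        granted = min(wanted, available)
        grants[app] = granted
        available -= granted
    return grants, available


requests = [("reporting", 2, 6), ("trading", 1, 8), ("batch", 3, 4)]
grants, free = allocate_nodes(total_nodes=16, ha_spares=2, requests=requests)
```

With 16 nodes and 2 spares, the lowest-priority batch workload receives nothing here. Deciding such trade-offs by policy across thousands of nodes is the part the text argues is beyond manual management.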
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SOLUTION: END-TO-END SERVICE LEVEL MANAGEMENT 


So the goal of autonomics is to enable the infrastructure to manage itself with 
minimal intervention by human beings. People will still set high-level business and 
technical policies about the infrastructure, but the system itself will do installations, 
maintenance and tuning, recommend capacity plans and so on. All of this will be reported 
through comprehensive graphical displays. 

The elements are a single system image for competing workloads running within 
multiple server farms; automated management of workload groups in response to service 
metrics; and one event system across and between stateless and stateful cluster domains. 
The result is one big happy cluster with no mid-tier/backend distinction, mixed storage, 
and flexible mapping of services to nodes. 


I’m not going to talk about customer successes in this section. All customers are 
taking advantage of autonomics to some degree today. Autonomics has been 
incorporated into software for some years and will continue to be refined in coming 
years. Everyone uses these capabilities, and will continue to use them, without really 
having to know much about them. 


Solution: End-to-End Service Level 
Management 



Figure 33 
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DISTRIBUTED COMPUTING 


The final aspect of this new infrastructure is Grid Computing. Grid computing is 
a set of technologies that take into account the fact that not all applications and all data 
will necessarily reside in a single resource pool. There are many times when resources 
are distributed and applications and users must access these resources in a distributed 
fashion. 


Distributed Computing 

Distributed Applications 
Distributed Data 
Distributed Users 
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DEFINING DISTRIBUTED COMPUTING 


Grid computing provides standard interfaces between resource pools so that users 
and applications in one location can access resources in another location transparently. 

The Grid is not an alternative to the Internet. It is a set of additional protocols and 
services that build on Internet protocols and services to support the creation and use of 
computation and data-enriched environments. 

Grid computing is not web services. Web services are one technology for 
implementing distributed applications, but there are many more technologies involved in 
Grid computing. 

Grid computing is not peer-to-peer computing. P2P is one style of distributed or 
Grid computing that relies on systems in place for a limited class of parallel processing 
applications. 

Grid computing is also not the same as application hosting. Application hosts 
may use Grid Computing technology to make their resources available, but app hosting is 
more about the business issue of outsourcing work than about distributed computing 
technology. 


Defining Distributed Computing 

Network of Clients and Service Providers 

- Standard interfaces and universal availability 

- Resource sharing, fault tolerance, and load 
balancing 

What It’s Not 

- Next Generation Internet 

- Just Web Services 

- Just Peer-to-Peer Computing 

- Application Hosting 
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GRID ARCHITECTURE 


A Grid architecture consists of different applications using a meta-reservation 
service to schedule run-time. The scheduler uses a resource management protocol to 
identify appropriate computing resources around the network to run the applications. The 
computing resources need to advertise their characteristics such as types of applications 
they can run, databases they have access to, etc. 
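The matching step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Resource` and `Job` records, their fields and the site names are hypothetical and are not taken from Globus or any other grid toolkit. Each resource advertises what it can run and which databases it can reach, and the scheduler picks a resource that satisfies the job.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Resource:
    """Characteristics a computing resource advertises to the scheduler."""
    name: str
    app_types: Set[str]
    databases: Set[str]
    free_cpus: int


@dataclass
class Job:
    """What an application asks the scheduler for."""
    app_type: str
    database: Optional[str]
    cpus: int


def find_resource(job: Job, resources: List[Resource]) -> Optional[Resource]:
    """Return the matching resource with the most free CPUs, or None."""
    candidates = [
        r for r in resources
        if job.app_type in r.app_types
        and (job.database is None or job.database in r.databases)
        and r.free_cpus >= job.cpus
    ]
    return max(candidates, key=lambda r: r.free_cpus) if candidates else None


resources = [
    Resource("site-a", {"simulation"}, {"genomics"}, 8),
    Resource("site-b", {"simulation", "reporting"}, {"credit"}, 32),
    Resource("site-c", {"simulation"}, {"credit"}, 64),
]
chosen = find_resource(Job("simulation", "credit", 16), resources)
```

Choosing the resource with the most free CPUs is only one possible policy; a real scheduler would also weigh reservations, data locality and cost.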


Grid Architecture 


Application Application Application App Layer 
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WHY SHOULD YOU CARE? 


There are some important benefits of Grid computing. You can now create 
gigantic applications that otherwise would not have the resources to run. You can access 
data that would otherwise be unavailable. 

This is extremely valuable for many day-to-day applications. Many companies 
need regular access to remote data sources such as consumer credit information, medical 
or genomic databases, and more. These can now be made a transparent part of any 
application that is granted access. This means more real-time availability of information 
and better applications. 

Grid computing can also solve problems with mergers and acquisitions. When a 
company spins off a subsidiary it is often difficult to immediately separate ERP or CRM 
systems. Grid computing can provide access to existing company applications to the 
former subsidiary while maintaining real separation of information, proper billing for 
usage, etc. 

Grid computing can also be attractive for companies that have many suppliers or 
distributors that need to share information about manufacturing processes, inventories, 
etc. By having appropriate access to the data warehousing information from your supply 
or distribution chain you can do much better data mining or simulation work. 


Why Should You Care 

Completely Changes the Economics of 
Computing 

- Drastically lowers cost 

- Extends availability of resources 

Valuable for Many Organizations 

- Accessing external data sources - credit bureaus, 
genomic or medical databases, etc. 

- Mergers & acquisition situations 

- Large supply or distribution chains 
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STANDARDS EFFORTS 


Grid computing is gaining momentum. 


Standards Efforts 

Global Grid Forum 

- Provide standards specifications for grid 
technologies 

- Comprises over 200 organizations 

- Oracle UK co-chairs the Data Access and 
Integration working group 

Globus 

- provides open source toolkit conforming to Global 
Grid Forum specifications 
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DISTRIBUTED COMPUTING CUSTOMER EXAMPLES 


Gene Logic chose the AVAKI technology in order to maximize the use of its 
existing internal computing infrastructure, which has cyclical usage characteristics. The 
AVAKI technology will be employed to create more efficient utilization of this 
infrastructure, and more rapid completion of certain internal analysis efforts. 

Deutsche Bank is one of the leading international financial service providers. Its 
investment banking division relies heavily on technology to meet the computing needs of 
its traders around the world. While traders at Deutsche Bank’s New York office use 
high-end Pentium desktops and UNIX workstations to carry out their daily functions 
effectively, the need for compute power intensifies during end-of-day reporting and 
analysis. Following in the footsteps of its colleagues in Frankfurt, Germany, the NY 
office implemented Platform’s workload management solution, Platform LSF. This 
allowed Deutsche Bank to create a virtual mainframe from its existing cluster of 
computers, and eliminated the need to purchase additional hardware to address its 
demanding computing needs. 

European Aeronautic, Defense and Space Company (EADS) is Europe’s largest 
aerospace company, resulting from the combination of Aerospatiale Matra SA and 
DaimlerChrysler Aerospace AG (DASA). Prior to the merger between the two 
companies, DASA’s computing demands had traditionally been met by central 
mainframes, with some peak requirements satisfied by external supercomputers. Over a 
period of time, this environment had been replaced by distributed workstation and server 
systems. In an effort to replicate the easy-to-use, centralized mainframe environment, 
DASA adopted Platform’s workload management solution, Platform LSF MultiCluster, 
to manage their computing workload and distribute batch jobs across the network to the 
most suitable computers. 

GriPhyN: Communities of thousands of scientists, distributed globally and served 
by networks of varying bandwidths, need to extract small signals from enormous 
backgrounds via computationally demanding analyses of datasets that will grow from the 
100 Terabyte to the 100 Petabyte scale over the next decade. The computing and storage 
resources required will be distributed, for both technical and strategic reasons, across 
national centers, regional centers, university computing centers, and individual desktops 
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Distributed Computing Customer 
Examples 

Scientific: European Data Grid (CERN) 

- Thousands of physicists analyzing petabytes of 
distributed elementary particle data 

Aerospace: DaimlerChrysler Aerospace 

- Use distributed servers to perform complex 
simulations 


ORACLE 


Figure 39 


Example Grid Application: Data Grids for High Energy Physics 



~100 MBytes/sec 


There is a "bunch crossing" every 25 nsecs. 
There are 100 "triggers" per second. 
Each triggered event is ~1 MByte in size. 

TIPS is approximately 25,000 
SpecInt95 equivalents 


622 Mbits/sec 
or Air Freight (deprecated) 


Germany Regional Centre 
Italy Regional Centre 
Fermi Lab ~4 TIPS 
Tier2 Centres (Caltech and others) ~1 TIPS each 

Figure 40 
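The figures quoted on this slide are consistent with each other, as a short calculation shows. The only assumptions are decimal units (1 MByte = 8 Mbits) and the constant names, which are introduced here for illustration.

```python
# Values quoted on the slide.
BUNCH_CROSSING_INTERVAL_NS = 25
TRIGGERS_PER_SECOND = 100
EVENT_SIZE_MBYTES = 1
LINK_MBITS_PER_SECOND = 622

# One crossing every 25 ns gives 40 million crossings per second.
crossings_per_second = 1_000_000_000 // BUNCH_CROSSING_INTERVAL_NS

# Only 100 of those are kept each second: one in 400,000.
crossings_per_trigger = crossings_per_second // TRIGGERS_PER_SECOND

# 100 events/s at ~1 MByte each reproduces the ~100 MBytes/sec on the slide.
output_mbytes_per_second = TRIGGERS_PER_SECOND * EVENT_SIZE_MBYTES

# A 622 Mbits/sec link carries 77.75 MBytes/sec, less than the full output rate.
link_mbytes_per_second = LINK_MBITS_PER_SECOND / 8
```

Because a single 622 Mbits/sec link cannot carry the full ~100 MBytes/sec stream, the data has to be divided among regional centres rather than copied whole to each one.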
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Example Grid Application: Stanford Linear Accelerator 



Figure 41 


Business Transaction Grid 

High-end Transaction Processing Systems 

• Example 

• Stock Trading system 

• Many feeds from other systems or exchanges. Head- 
end based approach to throughput, much like a 
concentrator 

• Scalable 

• Decentralized Services 

• Plug-in more capacity to cope with spikes in demand 

• QoS guarantees required 

• Real-time or near real-time execution 

• Non-repudiation 




Figure 42 
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Oracle Grid Features 

• High Availability 

- Protection from failures, disasters, and human errors 

- 24x7 operation with online maintenance 

• End-to-End Grid Security 

- Authentication with SSO/PKI, Kerberos, and RADIUS 

- Enterprise level authorization and delegation with Enterprise 
User Security 

- Secure transport via SSL 

• Portability between Grid phases 




Figure 43 


Oracle Grid Features 

• Manageability 

- Centralized security credential and user management via 
LDAP 

- Resource Management to enforce fair allocation of 
database resources 

- Automatic SQL execution memory tuning 

- Memory tuning advisors 

- Automatic storage management 

- Enterprise Manager GUI manages complete enterprise 
stack 




Figure 44 
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ORACLE GRID FEATURES 


Transportable tablespaces: Add a tablespace to a database and begin processing. 
This is similar to tape racks on IBM mainframes: the database is a tape rack, and the 
tapes are tablespaces. 


Oracle Grid Features 

• Information Sharing with Oracle Streams 

- Unites all information into a single stream 

- Flexibly routes, notifies, applies, interoperates 

- Share information between nodes, between blades 

- Exchange messages, replicate objects 

• Distributed (Federated) database access 

- Distributed query and DML with location transparency 

- Intelligent optimization of distributed execution plans 

- Full heterogeneous support including automatic distributed 
transaction management 




Figure 45 


Oracle Grid Features 

• Transportable Tablespaces 

- Plug and unplug physical database files without 
loading and unloading tables 

• Real Application Clusters (RAC) 

- Naturally suited to the Grid — Expensive SMP 
servers not required for databases 

- Allows Oracle databases to dynamically grow 
and shrink on a blade farm Grid 

- Provides unlimited scalability and high 
availability for any application 




Figure 46 
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RAC ARCHITECTURE ADVANTAGE 


In the early 90’s, everyone wrote off shared disk except Oracle. Oracle persisted and 
now owns this space, with 45 patents that competitors can’t match. Shared disk matches 
the current trend toward network storage (SAN, NAS), while shared nothing (SN) reflects 
the limitations of disk storage connectivity from the 80’s. RAC runs real applications: the 
proof is customers like UPS (2 x 36 CPU), FAA (5-node Linux), Travelocity, . . . and 
applications like SAP and Oracle. 


RAC Architecture advantage 


Shared disk 

Add blade 


Remove idle blade 


Run any application 


Shared nothing 

Add blade and repartition 

Remove idle blade and repartition 

Fail when blade fails 

Run parallel query and 
benchmarks only 



Figure 47 
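The "add blade and repartition" cost of a shared-nothing design can be made concrete with a small sketch. It assumes the simplest possible placement rule, row key modulo the number of blades, which is chosen here for illustration and is not a description of how any particular database partitions data. Under that rule, most rows change owner when a blade is added, whereas in a shared-disk design every blade can already reach every row and nothing has to move.

```python
def owner(key: int, blades: int) -> int:
    """Blade that owns a row under an assumed modulo placement rule."""
    return key % blades


def rows_moved(keys, old_blades: int, new_blades: int) -> int:
    """Count rows whose owning blade changes when the blade count changes."""
    return sum(1 for k in keys if owner(k, old_blades) != owner(k, new_blades))


keys = range(1200)
moved = rows_moved(keys, old_blades=4, new_blades=5)
fraction_moved = moved / len(keys)
```

Going from four to five blades relocates 80% of the rows under this rule. Smarter partitioning schemes reduce that fraction but cannot bring it to zero, because the new blade must end up owning some share of the data.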
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DATASYNAPSE - DO MORE WITH LESS 


James Bernardin 
DataSynapse, Inc. 
New York, NY 
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DataSynapse 

Do More With Less 

LiveCluster™: Self-managed, guaranteed 
distributed computing platform 

Presented By: 

Jamie Bernardin 
Chief Technology Officer 

March 18, 2003 



"Software Product of the Year" awards 
"Tech Winner’s Circle" 
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Agenda 


HPC Challenges and DataSynapse Vision 

Traditional Approaches to Distributed Computing 

DataSynapse Customers Benefits and Value Proposition 

DataSynapse Solution 

LiveCluster Functional Components 

LiveCluster Application Focus 

Corporate Background & Industry Validation 


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Strategic Challenges 




Organization 
Challenges 


> Migrate to lower cost 
(Intel) based architecture 

> Share resources across 
departments 

> Reduce future hardware 
purchase requirements 

> Increase productivity 


Department 

Challenges 


> Accelerate time-to- 
production 

> Decouple applications from 
underlying infrastructure 

> Add capacity-on-demand 

> Stop building and 
maintaining infrastructure 



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Figure 3 


DataSynapse Vision 


> Provide Infrastructure for Faster, Cheaper Computing Systems 

■ Move customers out of the infrastructure business 

■ Provide a scalable, on-demand compute infrastructure 

■ Allow sharing of compute resources 

■ Combine legacy and new hardware seamlessly 

■ Simplify management and operation 

> Maintain Application Focus 

■ Let customers focus on the application 

■ Allow in-process, low latency distributed computing 

■ Advance from batch and job scheduling modes 

■ Provide data-caching and state management solutions to enable more 
applications to be hosted on grids 


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Figure 4 
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DataSynapse Vision 


Agility, resilience, 
scalability and 
interoperability 




Simplified, unified, 
fault-tolerant application 
operating environment 


Applications 




Middleware 


LiveCluster 


Systems 

Management 


Standardization, 
commoditization, 
better utilization 


Hardware 




Self-managed, adaptive, 
on-demand computing 
grid 


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Traditional Approaches to 
Distributed Computing 

> Message passing APIs (MPI, PVM) 
> Queue-based messaging products (MQ Series) 
> Job schedulers (PBS, LSF) 
> Shell scripting (rsh, ssh) 
> Distributed object technology (CORBA) 



Figure 6 
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Selected Customer Case Studies 


> Wachovia Securities Capital Markets 

■ Combine 200+ desktops, Sun E10K, and Intel 8-ways 

■ Application integration (Java) in one person-week 

> Bank of America Global Risk Management 

■ ,000+ processors 

■ Share compute resources across departments 

> Two Top Petroleum Super-Majors 

■ Utilize thousands of state of the art desktops 

■ Run terabyte migration/imaging processing apps 

■ Manage heterogeneous clusters 





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Where We Help Our Customers Today 

Commodity Computing 

Up to 10x TCO advantage vs. SMPs 

Server Consolidation 

Increase utilization to 100% 

Lowered Application TCO 

Out-of-the-box solution, reduced overhead 
Redirect IT staff from maintenance to development 

SLA Requirements 

Exceed application uptime requirements 

Rapid Application 
Deployment 

Reduce time-to-deploy to days from months 

On-Demand Computing 

Move from constrained, tightly coupled environments to 
unlimited scalability over heterogeneous resources 

Managed Virtual 
Application Environment 

Support legacy and next generation applications across 
platforms within a single grid operating environment 

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Figure 8 
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DataSynapse Value Proposition 

LiveCluster enables applications requiring 
scalability to create a virtual environment that 
transforms IT resources into an on-demand easily 
managed grid computing infrastructure 

Dramatically reduce IT cost and application TCO 
Increase application / business performance 
Guarantee application uptime and resilience 
Accelerate time-to-deploy for production systems 
Enable commodity compute models 



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Figure 9 


Value to Customers 




WACHOVIA 


“Before implementing DataSynapse’s LiveCluster solution, running our P&L 
and risk reports could take as long as 15 hours overnight- 

turnaround our mission-critical reports in minutes, on a real-time 
intraday basis.” 

“Moreover, our group is trading 4x more volume and we have increased our 
modeling simulations by 25x - about a 100x magnitude performance 

on the LiveCluster software platform 

“We will trade over $1 billion of fixed income and related capital markets 
products over DataSynapse this year - we are booking larger, more exotic, and 
more lucrative trades with more accurate risk-taking - DataSynapse helps us 
make more money, period 

“We haven't scratched the surface yet for how we envisage using 
DataSynapse to meet our ongoing product development and trading activity 

- Andy Cook, Head Exotics Trader 


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DataSynapse Solution 


LiveCluster: Self-managed, guaranteed 

distributed computing platform 


Aggregate resources (clusters, desktops, big-iron) 

Allow and promote sharing of compute facilities 

Application focus (many ways to integrate) 

Ease-of-use allows acceleration of deployment 
and simplified management 



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Figure 11 


LiveCluster: A Product Solution 


Traditional Solutions & Approaches 

■ High development and support costs 

- Limited resilience, scalability or administration 

- Limited support for parallel computing 

■ Batch and job scheduling only 

■ Difficult to share resources across departments 


LiveCluster Grid Computing Infrastructure 


Complete application-centric product solution 
that overcomes traditional constraints 




Figure 12 
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GridServer Modules Virtualize the IT 
Environment into an On-Demand Grid 


LiveCluster Functional Overview 


Application Enablers: Parametric, Job/Task API Suite, Cluster Services, Packaged Interfaces 



LiveCluster 

GridServer™ 




Application Virtualization 


Scheduling State 

Management Management 


Enterprise Integration 


Sys Mgmt 
Connectivity 


Extensible 

Plug-ins 



Figure 13 


Application Virtualization 

> Services scheduling is adaptive and guaranteed 
> Many ways to integrate application components 

> Application state is managed by system 

> Heartbeat, activation, fail-over services 
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Figure 14 
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Utility Computing 

> Many methods for sharing and provisioning grid 
resources 

> Adaptive scheduling allows for automated 
scalability 

> Utilization statistics and charting (charge-back) 

> Audit trail and real-time diagnostics 

> Service-based architectures 



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Figure 15 


Enterprise Integration 

> Interface to existing enterprise infrastructure 

> Extensible event and command processing 
> XML-based workflow/batch facility 

> Security plug-ins 

> Open APIs for integration with other 
management systems 



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Figure 16 
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DataSynapse Focus on Applications Through 
a Virtual Managed Framework 


Applications 


Batch-Oriented 
Object-Oriented 
Service-Oriented 
Third Party 

Decreased Maintenance 
Increased Speed, Reliability 

Application Enablers 

LiveCluster 

GridServer™ 

Increased Utilization 
Decreased TCO 



Figure 17 


Support for Legacy (Batch-Oriented) 
Applications 

Support for batch processing and MPI-compatible 
scheduling 

Re-use existing executables, especially suitable 
for rapid integration 

Support for parametric parallelism 

Leverage adaptive load balancing for optimal 
utilization 

Self-managed environment increases productivity 
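Parametric parallelism means running the same unchanged computation once per parameter value, with the runs spread across whatever workers are free. The sketch below illustrates the idea with Python's standard `concurrent.futures` module on a single machine. It does not use or represent the LiveCluster API, and `price_scenario` is a hypothetical stand-in for an existing executable or model.

```python
from concurrent.futures import ThreadPoolExecutor


def price_scenario(rate: float) -> float:
    """Hypothetical model: present value of 100 paid in one year at the given rate."""
    return round(100.0 / (1.0 + rate), 2)


def run_sweep(func, parameters, workers: int = 4):
    """Apply func to every parameter value in parallel, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each idle worker pulls the next pending parameter, so faster
        # workers naturally take on more of the sweep.
        return list(pool.map(func, parameters))


results = run_sweep(price_scenario, [0.0, 0.25, 1.0])
```

The model function is untouched by the parallelization, which is why this style suits the re-use of existing executables that the slide mentions. On a grid the workers would be remote machines rather than local threads.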


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Figure 18 
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Support for Object-Oriented 
Applications 

Support for interactive, GUI-based (e.g. compute- 
and data-intensive) applications 

In-process capability to divide large workloads to 
perform work in parallel 

Based on high-level APIs and clearly defined, 
distributed object abstractions 

Can integrate legacy and next generation 
applications within days 

Improves application performance and guarantees 
application execution 



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Figure 19 


Support for Service-Oriented 
Applications 

Provides resilient, scalable architecture for web 
services 

Suitable for parallel processing and load sharing 

Loosely coupled, easy integration method to grid 
enable application components 

Offers explicit support for stateful processing 

Uniform cross-language support with simple 
object-oriented APIs 

Requires no language binding with client 
applications 




Figure 20 
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Corporate Background 


> Management and Foundation: 

■ Founders ex-Wall Street (physicists with NASA heritage) 

> Investors: 

■ Bain Capital, Intel Capital, Wachovia Strategic Ventures 

> Industry Focus: 

■ Finance - Energy - Government 

> Strategic Partners: 

■ Oracle - IBM - Sun - Intel - HP 


> Offices: 

■ New York - London; 

■ Washington, DC - San Francisco -Houston - Chicago 



Figure 21 


Partners Selecting Data Synapse 

ISVs: 

Calypso 
Milliman USA 

Infrastructure: 

"Grid" Computing Initiatives 
Distributed Computing 
Utility Data Center 
Egenera 
Grid Computing 
N1 Initiative 
Process Area Network 
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Intel Selected DataSynapse as its 
Exclusive Distributed Computing Partner 


[Magazine cover: "The Changing Infrastructure of Business". Listed inside: IBM, DataSynapse, BEA Systems, Inc., Macromedia, Veritas, BMC Software, Borland, Computer Associates.]

With today’s shaky economic situation and geo-political uncertainties, 
unpredictable change is one of the few certainties in business. How 
can companies adjust to, and even take advantage of, change rapidly, 
without disrupting critical operations? Success requires rapid and 
graceful adaptation: adaptation of business processes along with ... 

... source of sustainable competitive advantage. 
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Industry Analysts Recognize DataSynapse 


Aberdeen Group 


“By focusing on the applications themselves rather than the 
traditional approach of focusing on resources, 

transforms applications' performance, reliability and 
resiliency, and cost of ownership.” 

“DataSynapse is leading the way in providing commercial 
solutions for application reliability and resiliency with its 
distributed computing solution, LiveCluster.” 


“IDC views the DataSynapse product as a solution for 
organizations with a variety of IT assets from clusters and 
hallway grids to campus grids and intraprise grids.” 


Bloor Research 


“DataSynapse knows how to easily integrate a guaranteed 
distributed computing solution into customers’ legacy and 
new applications to exploit an underlying grid environment.” 

“DataSynapse is the only vendor...that approaches grid 
computing from a 'commercial' perspective.” 


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DataSynapse and Grid Industry Standards 

LiveCluster is built using accepted industry standards (.NET, 
Web Services, J2EE) 

DataSynapse is an active member of the OGSA-WG and is implementing 
many relevant OGSA services 

LiveCluster is designed to be compatible with emerging grid 
standards 



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Figure 25 


Conclusion 


> Provide Infrastructure for Faster, Cheaper Computing Systems 

■ Move customers out of the infrastructure business 

■ Provide a scalable compute infrastructure 

■ Allow sharing of compute resources 

■ Combine legacy and new hardware seamlessly 

■ Simplify management and operation 


> Maintain Application Focus 

■ Let customers focus on the application 

■ Allow in-process, low latency distributed computing 

■ Advance from batch and job scheduling modes 

■ Provide data-caching and state management solutions to enable more 
applications to be hosted on grids 


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STAR BRIDGE SYSTEMS 


Jim Yardley 
Star Bridge Systems 
Midvale, Utah 
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Star Bridge Systems 
Who Are We? 


Star Bridge Systems® 

A Pioneer in the field of reconfigurable 
computing. 

Established in 1998 in Midvale, Utah, 

Developer of highly innovative hardware and 
software turning FPGAs into inherently 
parallel, general-purpose, reconfigurable 
supercomputers. 


Figure 1 



Figure 2 
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Zion National Park 


■ Sandstone Monuments towering hundreds of 
feet over the Virgin River 


■ Virgin River 

■ Head waters 10K feet 

■ Zion Park 2000 feet 


■ Transition of the Virgin River 

■ High mountains to Sandstone monuments 

■ Narrow canyons through which the water passes 


Figure 3 



Figure 4 
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Zion National Park 

■ Zion Narrows 


■ 17 mile hike 

■ Wall to wall water 

■ Sandstone cliffs several hundred feet 
straight up 

■ Vegetation 

■ Trees and bushes growing out of the 
sandstone walls 


Figure 5 



Figure 6 
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The Tree 


■ Seed found soil/nourishment 

■ Roots overgrew their support 

■ Tree died 

■ "Dead Tree Syndrome" 


Figure 8 
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Parallel Reconfigurable 
Computing 

■ The shortest distance from thought to 
solution 


■ Eliminates the "Dead Tree" Syndrome in 
High Performance Computing 


Figure 9 



High Performance Computers 

■ FFT on a PDP8E 


■ Research for NASA 

■ Upper atmosphere 
■ Serial Processors 


■ 8080 / 6800 / µP wars 

■ Fairchild decimal processor chip set 


Figure 10 
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IBM PCs/Clone 

■ Operating Systems/Databases 

■ Microsoft 

■ Oracle 


■ Cluster computers 

■ Specialized Hardware 

■ Parallel computing 

■ COTS— Commercial off the shelf 
Systems 


Figure 11 



Figure 12 
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ASICS 

■ Cost $5M to $40 M 

■ Requires a very large market 


Figure 13 



FPGAs 

■ How to program 

■ Cost to program 

■ Full data set implementation 


Figure 14 
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Figure 15 


So what does this mean? 


Soaring without limits 
Have we reached our limits 
When will we go for the ride again 




Figure 16 
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“Re- configurable Computing”, 


A phrase coined by Kent Gilson, refers to "the frequent remanufacture 
or morphing of the entire physical hardware, according to the demands 
of the user's specific behavioral requirements". 


With re-configurable computing, you don't waste a lot of time moving 
in and out of memory, because all operations are performed on hardware, 
which makes things move quickly. Kent Gilson refers to this as 
hyper-computing. 



Figure 17 


The Potential For Efficient Computing is Greater 


Obtaining parallelism in processing would be a gigantic leap in 
programming, because it more closely depicts how things happen in 
the real world. 



Figure 18 




The Potential For Efficient Computing is Greater 


FPGAs can also be produced much faster than a standard ASIC 
(Application Specific Integrated Circuit) chip, which can take 
as long as 18 months to produce. This makes them a good choice 
for the future of hyper-computing. 



(Decreased production time + Increased flexibility 
+ Multi-processing) = 

Cheaper, Faster, More Efficient Computing 


Figure 19 


The Need for Faster Machines 


■ Problems Dependent on Computation and Manipulation of 
Large Amounts of Data 

■ Image and Signal Processing 

■ Entertainment (Image Rendering) 

■ Database and Data Mining 

■ Seismic 

■ Grand Challenge Problems: 

■ Climate Modeling 

■ Fluid Turbulence 

■ Pollution Dispersion 

■ Human Genome 

■ Ocean Circulation 

■ Quantum Chromodynamics 

■ Semiconductor Modeling 

■ Superconductor Modeling 

■ Combustion Systems 

■ Vision & Cognition 



Figure 20 
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Parallel Computing 

■ Why? Need for Speed 

■ What? 

■ Clusters 

■ ASICs 

■ FPGAs 

■ Heterogeneous 

■ When? Now 

■ How? I'm going to show you 


Figure 21 



Parallel Processing 

■ Traditional computers: Serial Processing 

■ HAL hypercomputer: Parallel Processing 

■ Parallel Processing: the ability to 
execute numerous tasks simultaneously 

■ Possible because of FPGAs 


Figure 22 
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Figure 23 



Figure 24 
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Star Bridge Systems 
Fundamental Technology Shift 


“This technology will allow us to translate our 
ideas into solutions ...as fast as we can think” 

Dr. Robert Singleterry, NASA Research Scientist 



High Level Language 


True Parallel Hardware 


• High level graphical language 

• Directly programs FPGAs 

• Reusable highly flexible objects 


Reconfigurable hardware 
Unique FPGA Implementation 
Perfect computing every time 


Figure 26 
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Speed Through Superspecificity 


■ Microprocessors are usually designed with 
generalized functions to address a wide 
variety of algorithmic applications. 

■ Star Bridge technology creates only the 
necessary and sufficient circuitry needed for 
the specific application. 

■ FPGA core development - 1/16th to 1/20th the 
size of VHDL designed cores. 


Viva® Brings HyperComputing ® to Life 



Figure 27 



[Diagram labels] 

Your custom Hardware or SBS HyperComputers® 

Rapid Application Development 

FPGAs 

Architecture on Demand, or “Circuitry” 

Re-configurable 

Perfect Computing Every Time! 

Full Utilization 

Thousands of Operations per Clock Cycle 

Small Footprint 

Super-Specificity 

Low Power Use 


Figure 28 
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Figure 29 


“Our mandate is to pioneer the future ... to push the 
envelope ... to do what has never been done before.” 

NASA Vision by Administrator Sean O’Keefe, April 12, 2002, Maxwell School at 
Syracuse University 


■ ILLIAC (Ames - 1972) 

■ Finite Element Machine (Langley - 1982) 

■ MPP (Goddard - 1983) 

■ Cray (Langley - 1989) 

■ Intel (Delta - 1992) 

■ IBM (LaRC - 1995) 

■ Star Bridge Systems HAL 15 (Langley - 2002) 

■ Star Bridge Systems HC36 (Langley - 2003) 


Figure 30 
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Traditional Sequential Programming 


■ Fortran and C programs have been 
written for serial computers: 

■ One instruction executed at a time 


■ Using one processor or clusters 

■ Processing speed dependent on how fast 
data can serially move through hardware 
and subsystems 


■ Subsystem communications bottlenecks 


Figure 31 



How to Program 
Reconfigurable Computers? 

■ Data Flow Programming Style 

■ Must Think "Inherently Parallel" 

■ Graphical Based "Language" 

■ Everything Tied to Clock Signals 

■ No Von Neumann Bottlenecks 

■ Programming Power Tied to Number of 
Gates or Area (Number) of FPGAs 


Figure 32 
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Hal-15 


■ Present Product 

■ Hal-15 

■ 10 FPGAs/Board 

■ 20 billion MACs (16-bit multiply-accumulates/sec) 

■ 5 billion FLOPS (32-bit floating-point operations/sec) 

■ 500 Giga OPS (4-bit integer operations/sec) 

■ Configurable Options: 

■ 2-10 FPGAs per Board, up to 20 boards per system 

■ Viva 1.5 Release 



Figure 33 



The 3-Points of HAL: 

Hyper Algorithmic Logic Computer 


• The HAL 15 system here at NASA was the first system to be 
delivered by Star Bridge to an established high performance 
computer user. The HAL 15 uses a combination of an Intel based 
workstation, and a PCI board containing 10 Xilinx FPGA chips. 


• IIADL (Implementation Independent Algorithm Description 
Language) is a new programming language that makes it possible 
for an FPGA-based re-configurable computer to operate as a 
general-purpose computer system. 


• Viva (Latin word for "life"), brings life to HAL and hyper- 
computing as an OS, compiler, and graphical user interface all in 
one. 


Figure 34 
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HAL in NASA 


■ Spacecraft and Satellite control centers 

■ Solutions for structural, electromagnetic and 
fluid analysis 

■ Radiation analysis for astronaut safety 

■ Atmospheric science analysis 

■ Digital signal processing 

■ Pattern recognition 

■ Acoustic analysis 


Figure 35 


Langley Algorithms Developed* 


• Matrix Algebra: Vectors, Matrices, Dot Product 

• Factorial => Probability: Combinations/Permutations 

• CORDIC => Transcendentals: sin, log, exp, cosh... 

• Integration & Differentiation (numeric) 

• Matrix Equation Solver: [A]{x} = {b} via Gauss & Jacobi 

• Dynamic Analysis: $[M]\{\ddot{u}\} + [C]\{\dot{u}\} + [K]\{u\} + \mathrm{NLT} = \{P(t)\}$ 

• Analog Computing: digital implementation 

• Nonlinear Analysis: “Analog” simulation avoids NLT 
solution development time 

* In AIAA & Military & Aerospace Programmable Logic Device (MAPLD) papers 
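
The "CORDIC => Transcendentals" item refers to the CORDIC family of shift-and-add algorithms, which suit FPGAs because the core loop needs no hardware multipliers. The following is a minimal Python sketch of rotation-mode CORDIC for sine and cosine. The function name, the iteration count and the use of floating point are illustrative assumptions; this is not the Langley or Viva implementation.

```python
import math

def cordic_sin_cos(theta, iterations=32):
    """Return (sin, cos) of theta in radians, |theta| <= pi/2, by rotation-mode CORDIC."""
    # Elementary rotation angles atan(2^-i); hardware would hold these in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i); pre-scale to cancel it.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0          # rotate toward a zero residual angle
        # In fixed-point hardware the factor 2^-i is a right shift by i bits.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x

if __name__ == "__main__":
    s, c = cordic_sin_cos(0.5)
    print(s, math.sin(0.5))
    print(c, math.cos(0.5))
```

Each iteration adds roughly one bit of accuracy, so the iteration count trades circuit depth against precision.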



Figure 36 





Figure 37 
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VIVA: Gateware Development Tool 



■ Simple tool to configure FPGAs (VHDL cumbersome) 

■ Transforms high-level graphical code to logic circuitry 

■ Achieve near-ASIC speed (w/o chip design) 


Growth in VIVA Capability 


VIVA 1 (Feb ’01)             VIVA 2 (July ’02) 

NO Floating Point            Extensive Data Types 
NO Scientific Functions      Trig, Logs, Transcendentals 
NO File Input/Output         File Input/Output 
NO Vector-Matrix Support     Vector-Matrix Support 
Access to One FPGA           Access to Multiple FPGAs 
Primitive Documentation      Extensive Documentation 
Weekly Changes               Stable Development 
Frequent “bugs”              Few “bugs” 


Figure 39 
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Year 2: Exploit Latest FPGAs 



Rapid Growth in FPGA Capability 


                         FPGA (Feb ’01)    FPGA (Aug ’02) 

Xilinx FPGA              XC4062            XC2V6000 
Gates                    62K               6 million (97x) 
Multiplies in H/W        0                 144 (18x18) 
Clock Speed (MHz)        100               300 (3x) 
Memory                   20 Kb             3.5 Mb (175x) 
Memory Speed             466 Gb/s          5 Tb/s (11x) 
Reconfigure Time         100 ms            40 ms (2.5x) 
GFLOPS                   0.4               47 (120x) 

Total GFLOPS (10 FPGAs)  4                 470 


Plans: 

- Millions of Matrix Equations for Structures, Electromagnetics & Acoustics 

- Rapid Static & Dynamic Structural Analyses 

- Cray Vector Computations in Weather Code (VT PhD) 

- Robert on Administrator’s Fellowship at Star Bridge Systems 

- Joint proposals with NSA & DARPA 

- Simulate advanced computing concepts using VIVA 

- Collaborate with SBS to expand VIVA libraries 

- Influence VIVA development to meet NASA application needs 

- Expand FPGA applications for NASA programs 


Figure 41 
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FPGA Performance Curve 



■ Microprocessor density is doubling only every 18 to 24 months 
under Moore's Law 

■ FPGAs are on a much steeper growth curve. 

■ In late 1999, Xilinx projected the following growth path for its 
most powerful FPGA chips: 

■ 1998: 1 million gates per FPGA 

■ 1999: 2 million gates per FPGA 

■ 2000: 4 million gates per FPGA 

■ 2002: 10 million gates per FPGA 

■ 2004: 50 million gates per FPGA 
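
To put the "much steeper growth curve" claim in numbers, the projected gate counts can be converted into an average doubling time and compared with the 18 to 24 months quoted for microprocessors. The short Python sketch below assumes steady exponential growth between the endpoints, which is a simplification; the function name is illustrative.

```python
import math

# Gates per FPGA from the Xilinx projection listed above (year -> gates).
projection = {1998: 1e6, 1999: 2e6, 2000: 4e6, 2002: 10e6, 2004: 50e6}

def doubling_time_months(year0, gates0, year1, gates1):
    """Average doubling time in months, assuming steady exponential growth."""
    doublings = math.log2(gates1 / gates0)
    return 12.0 * (year1 - year0) / doublings

overall = doubling_time_months(1998, projection[1998], 2004, projection[2004])
print(f"1998-2004 average doubling time: {overall:.1f} months")
```

Under that assumption the projection doubles gate count roughly every 13 months over 1998 to 2004, against 18 to 24 months for microprocessor density.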


Figure 43 


Virtex-II - Distributed DSP Resources 


□ LUTs* & 
Registers 

- Up to 122,880 

- Logic + storage 

Usage examples 

- Pipelined algorithms 

- Multiple channels 

- Coefficient storage 

- Shift registers/delay 
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* LUT = Lookup Table 


□ Active Interconnect™ 

- Connects all resources 

■ SelectI/O™ 

- Up to 1,108 programmable I/Os 

www.xilinx.com 


■ 18x18 Multipliers 

- Up to 192 

- 200+ MHz 

Usage examples 

- High-performance FFT 

- Equalizers 


■ Block RAM 

- Up to 3.5 Mbit, true dual-port 

Usage examples 

- Data buffering + storage 

- Single-chip 2-8K FFT 

- Video line buffers 

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Reconfigurable Computing Software Enables Easy 
Programming Of Hardware For The US Air Force 




VIVA Software to Hardware 

"Traditional approaches to programming algorithms in FPGA are laborious and 
time consuming. Star Bridge Systems® of Midvale, Utah developed an 
Electronics Design Automation (EDA) tool called Viva® that solves these 
problems" 

Mr. Lloyd Reshard, AFRL/MNAV 




Figure 45 



Star Bridge Systems 


■ FPGA Technology 

■ Inherently Parallel Compute Substrate 


■ Reconfigurable Hardware 

■ How do you easily Program? Or Reconfigure??? 

■ Viva Software 

■ Hypercomputer 

■ Parallel Supercomputer Capacities and Capabilities 


Figure 46 
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Star Bridge Systems 

■ Reconfigurable Computing Technology 


■ Massively and inherently parallel 

■ Asymmetrical 

■ Ultra-tightly coupled 

■ Linearly scalable 


■ Realization of Parallel Computing 


Figure 47 



Star Bridge Systems, Inc. 


■ Hardware and Software Integrated 
System 

■ Parallel and Reconfigurable Computing 

■ Greater Programming Flexibility 

■ Lower Cost 

■ Lower Power Consumption 

■ Smaller Size 

■ Breakthrough in Performance and Speed 


Figure 48 
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List of Customers 


NSA 

George Washington University 
University of South Carolina 
George Mason University 
North Carolina A&T 


■ NASA 

■ Langley 

■ Marshall Space Flight Center 

■ San Diego Supercomputer Center 

■ U.S. Air Force Eglin AFB 

■ National Cancer Institute— Bio-informatics 

■ Commercial Customers for EDA applications 


Figure 49 



What are People Doing With This? 


■ NASA 


■ Seismic Petroleum Exploration 

■ Smiths-Aerospace 

■ National Cancer Institute 

■ US Air Force 


Figure 50 
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Hypercomputer® Configurations 


■ HC-36m Hypercomputer 

■ PCI/X Bus Interface 133 MHz, 64 bit 

■ 2 Virtex-II 4000 chips 

■ 5 Virtex-II 6000 chips 

■ 10 GByte RAM with 20 64-bit parallel memory channels 

■ 225 Gbits/sec inter-chip communications 

■ 200 plus External I/O pins 

■ HC-62m Hypercomputer 

■ PCI/X Bus Interface 133 MHz, 64 bit 

■ 2 Virtex-II 4000 chips 

■ 9 Virtex-II 6000 chips 

■ 18 GByte RAM with 36 64-bit parallel memory channels 

■ 450 Gbits/sec inter-chip communications 

■ 500 plus External I/O pins 

■ HC-98m Hypercomputer 

■ Contains one HC-62m and one HC-36m 


Figure 51 



New Products 

■ HC-36 and HC-62 

■ 11 Virtex-II FPGAs/Board, 62 million total 
gates 

■ 1 trillion DSP MACs/second 

■ 40 billion 64-bit FMULs (64-bit FP Multiplies) 

■ SDRAM 18 Gigabytes 

■ 36 channels by 64 bits memory 

■ Bandwidth 90 Gigabytes/sec 

■ 3 Megabytes Cache Memory 

■ Bandwidth 2.5 trillion bytes/second 

■ 672 Gigabits/second Aggregate Inter- 
processor Communications bandwidth 

■ Expandable to multiple boards per system 


Figure 52 
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Figure 53 


HC-62 Board Set Assembly 



Figure 54 
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Figure 55 



Star Bridge Systems 

■ Technical Discussion 


■ FPGA Hardware configuration 
■ Viva Discussion 


■ Viva Demonstration 

■ Bio-Informatics System Application 

■ Smith Waterman 

■ Air Force 1553 Standard 

■ Programming Examples 


Figure 56 
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.5 GBytes 
DDR RAM 



FPGA PE Structure 
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Hypercomputer System 
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Star Bridge Systems, Inc. 


■ Viva — Capability Computing Software 


■ Programming Language 

■ Compiler/Operating System 

■ Graphical User Interface 

■ Integrates Hardware and Software 

■ Software Implementation at Hardware Speeds 

■ Increased Productivity of Application Developers 

■ Solves Difficulty in FPGA Program Development 


Figure 60 
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VIVA software 


■ What: Graphical Programming 
Language 

■ How: Transforms high-level graphical 
code to logical circuitry 

■ Why: Achieves near ASIC speed 



Figure 61 



Viva- Primary Elements 


■ Rapid Application Development Environment 

■ Parallel Component Object Oriented Language 

■ IIADL— Implementation Independent Algorithm Description 
Language 

■ Execution Target/System Definition Tools 

■ Multi-Process Execution and Reconfiguration OS-Kernel 

■ Application Builder Libraries METALIB 

■ System Target Libraries 

■ User Interface/STDIO COM/ ActiveX Component Library 


Figure 62 
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User Library 


System Library 

(Behavior Atoms) 


Compiled Behavior (reduced to atoms) 



Virtual System 



System Description 


Other Hardware 
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Rapid Application 
Development 

■ COM/ ActiveX Form Designer 

■ Drag n Drop Graphical Program Editor 

■ Drag n Drop Data Set Editor 

■ Drag n Drop System Builder and 
Resource Editor 

■ Auto Generated Widget Interface 


Figure 64 
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u 

IIADL and Compiler 

■ Data Flow Centric Programming Model 

■ Parallel Component Object Oriented Language 

■ Recursive Algorithm/Topology 

■ Unlimited Operator Overloading 

■ Data Set Polymorphism 

■ Multi-Precise and Multi-type Operators 

■ Information Rate Polymorphism 

■ Multi-Rate Operators 

■ Context Sensitive Operator Synthesis 

■ Strong Types 

■ Data Set Composition/Decomposition Operators 

■ Dynamic Data Set Creation 

■ Timing Driven Partitioning/Co-synthesis 


Figure 65 
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Operating System Kernel 

■ Event Driven Real-time Mixed-Mode Execution Kernel 


■ Multi-process Hierarchical Thread creation and 
lifetime management operators 

■ Dynamic default Interface Creation and Execution 

■ Inter-process communication and processor side 
memory management 

■ Full COM (Component Object Model) Execution 

■ Stand Alone Executable Creation for Application 
Distribution 


■ Dynamic Reconfiguration Support 


Figure 67 
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Libraries 


■ Application Builder Libraries MetaLIB 

■ Math 


■ Memory 

■ I/O 

■ Control 

■ Logic Structure 

■ Image Processing 

■ Signal Processing 

■ Data Compression 


Figure 69 



Libraries 

■ System Target Libraries 


■ Emulation 


■ Single Symmetric 

■ 1 FPGA 

■ Full Resource 

■ Multiple FPGAs 

■ Distributed over Multiple Boards or Systems 


Figure 70 
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Libraries 


■ User Interface Component Libraries 
COM/ActiveX 

■ File I/O 


■ Strings 

■ Memory Management 

■ Data I/O 


Figure 71 



Star Bridge Systems 

■ Bio-Informatics Application 


■ Smith Waterman 


■ Air Force 1553 Interface Protocol 

■ Porting Viva to other FPGA systems 

■ Programming Hints 


Figure 72 
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Smith Waterman Algorithm 


■ Search databases for sequences similar 
to a query sequence 


■ Dynamic programming to determine an 
optimal alignment 


■ Score is assigned for each character-to- 
character comparison 


■ Used to determine the position of 
matches 


Figure 73 



Smith-Waterman Algorithm 


■ The Smith-Waterman algorithm compares 
segments of all possible lengths (LOCAL 
alignments) and chooses whichever 
maximises the similarity measure 


■ For every cell the algorithm calculates ALL 
possible paths leading to it. These paths can 
be of any length and can contain insertions 
and deletions 


Figure 74 
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Smith-Waterman Algorithm 


■ Only works effectively when gap penalties are 
used 

■ In example shown 

■ match = +1 

■ mismatch = -1/3 

■ gap = -(1 + k/3) (k = extent of gap) 

■ Start with all cell values = 0 

■ Looks in subcolumn and subrow shown and in 
direct diagonal for a score that is the highest 
when you take alignment score or gap penalty 
into account 

$$H_{i,j} = \max\{\, H_{i-1,j-1} + s(a_i, b_j),\ \max_{k \ge 1}\{H_{i-k,j} - W_k\},\ \max_{l \ge 1}\{H_{i,j-l} - W_l\},\ 0 \,\}$$
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Smith-Waterman Algorithm 


■ Four possible ways of forming a path 
For every residue in the query sequence 

1. Align with next residue of db sequence ... score is 
previous score plus similarity score for the two 
residues 


2. Deletion (i.e. match residue of query with a gap) 
... score is previous score minus gap penalty 
dependent on size of gap 

3. Insertion (i.e. match residue of db sequence with 
a gap) ... score is previous score minus gap 
penalty dependent on size of gap 

4. Stop ... score is zero 


Figure 76 
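
The four path choices above map directly onto the four terms of the Smith-Waterman recurrence. As a minimal software sketch (not the FPGA pipeline described later in this presentation), the Python below fills the scoring matrix using the example's weights: match +1, mismatch -1/3, and gap penalty W_k = 1 + k/3. The function names and the choice to return only the best score and its position are illustrative assumptions.

```python
def gap_penalty(k):
    """W_k = 1 + k/3, the gap weight of the example (k = extent of the gap)."""
    return 1.0 + k / 3.0

def similarity(a, b):
    """s(a, b): +1 for a match, -1/3 for a mismatch."""
    return 1.0 if a == b else -1.0 / 3.0

def smith_waterman(query, db):
    """Fill the H matrix; return (best_score, (i, j)) for the highest-scoring cell."""
    n, m = len(query), len(db)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]   # start with all cell values = 0
    best, best_pos = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # 1. Align the two residues (direct diagonal).
            diag = H[i - 1][j - 1] + similarity(query[i - 1], db[j - 1])
            # 2. Gap of length k, searching back along the subcolumn.
            up = max(H[i - k][j] - gap_penalty(k) for k in range(1, i + 1))
            # 3. Gap of length l, searching back along the subrow.
            left = max(H[i][j - l] - gap_penalty(l) for l in range(1, j + 1))
            # 4. Stop: a local alignment never scores below zero.
            H[i][j] = max(diag, up, left, 0.0)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    return best, best_pos

if __name__ == "__main__":
    print(smith_waterman("GCCAUUG", "GCCUCG"))
```

With a general gap function this direct form costs on the order of n·m·(n+m) operations, which is the workload the hardware implementation spreads across parallel cells.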
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Smith-Waterman Algorithm 


Construct Alignment 

■ The value in each cell is the maximum 
possible score for an alignment of ANY 
LENGTH ending at those coordinates 

■ Trace pathway back from highest scoring 
cell 

■ This cell can be anywhere in the array 

■ Align highest scoring segment 

GCC _ UCG 
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Smith-Waterman 

■ Smith-Waterman 


1. Local alignments 

2. Residue alignment score may be 
positive or negative 

3. Requires a gap penalty to work 
effectively 

4. Score can increase, decrease or stay 
level between two cells of a pathway 


Figure 78 
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Bio-Informatics Dilemma 


■ Sun Microsystems, Time Logic 

■ Hardware Accelerators 

■ 372,119 X 19,192 Comparisons 

■ 41 hours 46 minutes 

■ 1/10 of time without an accelerator 

■ Linux Cluster 


■ 144 days of uninterrupted processing time 

■ Star Bridge Solution 

■ Less than a second 


Figure 79 



Star Bridge Systems, Inc. 
Smith Waterman Prototype 



Figure 80 
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Main Smith Waterman 
Modules 


■ SW_Xpoint - FPGA which accesses PCI 
and User Interface busses. It gathers 
the data and presents it to SW_PE01, 
where calculations will take place. 

■ SW_PE01 - FPGA where Smith 
Waterman iterations are actually 
implemented. 


Figure 81 
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SW Program Flow 


■ As the query sequence is loaded, the Init_Cells object 
creates our initial column and stores it in 
SW_Cell_Mem. 


■ After this initialization period, SW_Cell_Mem will 
provide a cell to the SW_Iteration object every clock 
cycle. It will also write a newly calculated cell every 
clock cycle. 

■ The SW_Cell_Mem object stores every nth column, 
where n is the number of SW_Iteration objects. In 
order to avoid the expense of a dual port RAM, every 
address contains two cells, so that two may be read in 
one clock cycle, and two may be written in the next. 


Figure 84 
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Smith Waterman Cells 


■ There are as many cells as there are 
characters in the query sequence. 


■ The array of cells represents a column of 
the scoring matrix. 


■ The initial (zero) column is initialized 
and stored into the cell memory object, 
SW_Cell_Mem. 


Figure 85 




Cell Contents 

■ Each cell contains the following four 
parameters: 

■ Pattern - a character from the query 
sequence 

■ Score - the score of this cell in the current i,j 
position 

■ PatternStart - the position in the query 
sequence from which the score was 
calculated 

■ DataStart - the position in the reference 
sequence from which the score was 
calculated 


Figure 86 
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Pattern 


[Diagram: a CollectCell object with inputs Pattern, Score, PatternStart and DataStart, and an ExposeCell object with outputs Pattern, Score, PatternStart and DataStart.] 


The individual elements of a cell may be accessed 
and bundled using the CollectCell and ExposeCell 
objects. 


Figure 88 
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Cell Data Types 


■ The size of all data elements may be adjusted 
depending on usage: 


■ Pattern - contains as many bits as needed to encode 
characters from the sequences - 4 bits for genes. 


■ Score and PatternStart - must be the same size and 
be large enough to encode the number of entries in 
the query sequence 


■ DataStart - will be the largest data set as it must be 
able to encode any position in the reference 
sequence. 


Figure 89 



Size for the Job 


■ Because all the parameters are 
adjustable in their size, less circuitry is 
needed to calculate matches in smaller 
sequences. 


■ Smaller sequences may for this reason 
utilize more parallelism. 


Figure 90 
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In this example, our Pattern contains 4 bits, for 
modeling genes. The Score and PatternStart 
parameters contain 8 bits, so our query sequence may 
contain up to 256 characters. The DataStart parameter 
contains 13 bits, meaning our reference sequence may 
contain up to 8192 characters. 
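
The field widths in this example (4 + 8 + 8 + 13 = 33 bits per cell) can be illustrated with a small bit-packing sketch in Python. The functions `collect_cell` and `expose_cell` are hypothetical software analogs named after the Viva CollectCell and ExposeCell objects; the field ordering inside the packed word is an assumption, since the source does not state it.

```python
from collections import namedtuple

# Field widths in bits, taken from the example above.
WIDTHS = (("pattern", 4), ("score", 8), ("pattern_start", 8), ("data_start", 13))
Cell = namedtuple("Cell", [name for name, _ in WIDTHS])

def collect_cell(cell):
    """Bundle the four fields into one integer word (analog of CollectCell)."""
    word = 0
    for (name, width), value in zip(WIDTHS, cell):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word

def expose_cell(word):
    """Split a packed word back into its four fields (analog of ExposeCell)."""
    values = []
    for _, width in reversed(WIDTHS):
        values.append(word & ((1 << width) - 1))   # mask off the lowest field
        word >>= width
    return Cell(*reversed(values))

if __name__ == "__main__":
    cell = Cell(pattern=0b1010, score=200, pattern_start=17, data_start=8191)
    packed = collect_cell(cell)
    print(packed.bit_length(), expose_cell(packed))
```

Widening `score` and `pattern_start` to 9 bits, for instance, would allow 512-character queries at the cost of two more bits of circuitry per cell, which is the size-for-the-job trade-off described earlier.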
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MSB013 
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Parallelism 


■ If a given hardware system has enough 
physical resources to accommodate n 
SW_Iteration objects, the Smith Waterman 
program may operate on n columns in 
parallel. 


■ Hence n cells are computed every clock cycle. 


■ Conservative estimates place 150 
SW_Iteration objects in each Virtex II 6000 


Figure 92 





HC-62 


■ An HC 62 has the bandwidth to pass 
cells between 8 FPGAs, allowing for 720 
parallel SW_Iteration objects. 


■ At a conservative 100 MHz system clock 
speed, this gives 100,000,000 × 720 = 
72 billion Smith Waterman steps/second 


Figure 93 
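
The throughput figure follows from one cell per SW_Iteration object per clock cycle. The Python sketch below reproduces the arithmetic and, as an idealised illustration only, estimates the time to fill one scoring matrix of the size used in the earlier bit-width example (256 by 8192 cells); it ignores I/O, initialization and pipeline fill, and the function name is an assumption. Note that 720 objects across 8 FPGAs is 90 per FPGA, below the 150 per Virtex II 6000 estimated on the previous slide; the source does not explain the difference.

```python
CLOCK_HZ = 100_000_000     # "conservative" system clock from the slide
ITERATION_OBJECTS = 720    # parallel SW_Iteration objects quoted for an HC-62
FPGAS = 8                  # FPGAs the cells are passed between

cells_per_second = CLOCK_HZ * ITERATION_OBJECTS
objects_per_fpga = ITERATION_OBJECTS // FPGAS

def fill_time_seconds(query_len, reference_len):
    """Idealised time to compute every cell of one scoring matrix."""
    return query_len * reference_len / cells_per_second

print(cells_per_second)               # 72000000000
print(objects_per_fpga)               # 90
print(fill_time_seconds(256, 8192))   # roughly 2.9e-05 seconds
```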
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1553 Bus Controller 


■ IP Library element 

■ Reduce costs 

■ Reduce number of parts 

■ Eliminate parts obsolescence problems 

■ Increase flexibility 

■ Multiple channels on a chip 

■ Programmability for usage 

■ Bus Controller, Remote Terminal, or Bus Monitor 


Figure 95 



1553 Protocol 


■ Polymorphic 

■ Overloading 

■ Recursion 

■ Synthesis time resolution 

■ Polymorphic 1553 

■ Command word 

■ Data word 

■ Status word 

■ Programmability in encoding, length of data field, 
construction of the sync waveform, or the speed 
the data is transmitted 


Figure 96 
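
For readers unfamiliar with the words being parameterized, here is a minimal Python sketch of a fixed (non-polymorphic) command word as commonly described for MIL-STD-1553B: a 5-bit remote terminal address, a transmit/receive bit, a 5-bit subaddress and a 5-bit word count, followed by an odd parity bit and sent with Manchester II bi-phase encoding. The function names are illustrative, the 3-bit-time sync waveform is omitted, and none of this reflects the Viva implementation.

```python
def odd_parity_bit(value16):
    """Parity bit that makes the count of 1s over 16 data bits plus parity odd."""
    ones = bin(value16 & 0xFFFF).count("1")
    return 0 if ones % 2 == 1 else 1

def command_word(rt_address, transmit, subaddress, word_count):
    """Pack the 16 information bits of a command word, first-sent field highest."""
    for name, value in (("rt_address", rt_address),
                        ("subaddress", subaddress),
                        ("word_count", word_count)):
        if not 0 <= value < 32:
            raise ValueError(f"{name} must fit in 5 bits")
    return (rt_address << 11) | (int(bool(transmit)) << 10) | (subaddress << 5) | word_count

def manchester_encode(value16):
    """Return half-bit levels: a 1 is sent as high-then-low, a 0 as low-then-high."""
    bits = [(value16 >> i) & 1 for i in range(15, -1, -1)] + [odd_parity_bit(value16)]
    half_bits = []
    for b in bits:
        half_bits += [1, 0] if b else [0, 1]
    return half_bits

if __name__ == "__main__":
    word = command_word(rt_address=5, transmit=True, subaddress=1, word_count=4)
    print(hex(word), odd_parity_bit(word), len(manchester_encode(word)))
```

A polymorphic version in the sense of the slide would treat the field widths, the encoding and the bit rate as parameters resolved at synthesis time rather than the constants used here.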
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1553 Protocol 

■ Viva Implementation 


■ 150 CLBs Vs. 2800 logic elements from a 
competitor 

■ Short development time 


Figure 97 


Target Viva to XSV-800 


■ System Description 

■ FPGA System 

■ Clock System 

■ Parallel I/O System 

■ Parallel Input Behavioral Communications 
- Viva Port I/O Object 

■ Programming the FPGA 

■ Viva Programming 


Figure 98 
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Target Viva to XSV-800 


■ Programming the FPGA 

■ Implementation System 

■ Overloads board initializer program 

■ Viva spawns file progxsv.vex 

■ Calls a DOS application provided by board manufacturer 

■ Spawns Xsload, also provided by board manufacturer 

■ Viva Programming 

■ Program development 

■ EDIF file generation 

■ Xilinx place and route 

■ Spawns the progxsv.vex program 


Figure 99 
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Star Bridge Systems 
Hypercomputing 

■ Greatly Reduces Development Cycle 

■ Reduces Time to Market 

■ Parallel and Reconfigurable Computing 

■ Greater Programming Flexibility 

■ Lower Cost 

■ Lower Power Consumption 

■ Smaller Size 

■ Breakthrough in Performance and Speed 


Figure 101 



Recursive Examples 

■ Learn by taking apart examples and 
library elements. 

■ ADC example 

■ Have Fun Programming in Viva 


Figure 102 
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DARPA’S NEW COGNITIVE SYSTEMS VISION 


Zachary Lemnios 

DARPA/Information Processing Technology Office 
Arlington, VA 
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DARPA’s New Cognitive Systems Vision 


Mr. Zachary J. Lemnios 
Deputy Director 

Information Processing Technology Office 


18 March 2003 


Emerging and Future Computing Paradigms and their Impact on the 
Research, Training and Design Environments of the Aerospace Workforce 


Figure 1 
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DARPA and the 
Computing Revolution 


The information technology revolution of the second half of the 
20 th century was largely driven by DARPA/IPTO 
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A NEW CLASS OF AUTONOMOUS SYSTEMS 


This chart illustrates the challenge for autonomous systems, to provide increasing 
performance to enable new capabilities in environments of increasing complexity. 
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THE CHALLENGE OF COMPLEXITY 


The major focus for the newly formed IPTO is the development of cognitive 
systems. While computational performance has been increasing due to Moore’s Law, the 
productivity and effectiveness of these systems are not increasing at the same rates. 
Development of cognitive systems will enable systems to become more usable; more 
flexible in application; less vulnerable to attacks and more robust in detecting and 
recovering; all while remaining cost-effective. Cognitive systems are enabled by a firm 
foundation in the underlying science and mathematics in algorithms and information 
assurance, which is then embodied in robust software and executed on advanced 
hardware. Future IPTO programs will also focus on the critical aspects of autonomous 
perception; knowledge representation and reasoning; machine learning; and 
communications and interactions. 


The Challenge of Complexity 



While computational performance is increasing, productivity and effectiveness are 
not keeping up: 

- Users must adapt to system interfaces, rather than vice versa 

- Systems have become more rigid and more fragile 

- Systems have become increasingly vulnerable to attack 

- The cost of building and maintaining systems is growing out of control 
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DARPA/IPTO COGNITIVE SYSTEMS 



— DARPA/IPTO — 
Cognitive Systems 


• DARPA IPTO will create a new generation of 
cognitive computational and information 
systems with capability to: 

- reason, using substantial amounts of 
appropriately represented knowledge 

- learn from their experience so that they perform 
better over time 

- explain themselves and be told what to do 

- be aware of their own capabilities and reflect on 
their own behavior 

- respond robustly to surprise 


Systems that know what they’re doing 
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WHY NOW? 


Cognitive technology (from AI) is working in bits and pieces, ranging from large-scale 
knowledge bases to machine learning in support of data mining. 



Why Now? 




* Human-level scaling of HW technology is on the horizon 

* Foundations established for human neural systems 

* Cognitive technology (from AI) is being applied to initial problems 

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What $1000 Buys (after Kurzweil, 1999 & Moravec, 1998) 


From the keynote address by Randall D. Isaac (VP Science and Technology, IBM Research) at 
PACT 01, International Conference on Parallel Architectures and Compilation Techniques, 
Sept 8-12, 2001, Barcelona, Spain. 
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MEMORY WALL IS GROWING 


Based on the SIA roadmap projections, as CMOS feature sizes decrease and clock 
frequencies increase, memory access times measured in clock cycles actually 
increase. Memory access times are also a function of memory size or capacity, due to 
increased wiring delays. The memory wall only grows as memory is moved further 
and further away from the processor core (off chip or off the board). The trend to date 
has been to move memory on board the chip, but there is only so much that can be 
accomplished, as indicated by this graph. One solution is to develop Processor-in-Memory 
(PIM) designs with many small memory tiles local to small processor cores. 
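
The cycle-count effect described here can be illustrated with a short Python sketch. The 50 ns access latency and the clock rates are illustrative assumptions, not values taken from the SIA roadmap; the point is only that a roughly fixed latency in nanoseconds costs more clock cycles as the clock speeds up.

```python
import math


def access_cycles(latency_ns: float, clock_ghz: float) -> int:
    """Clock cycles a core waits for one memory access.

    One cycle lasts 1/clock_ghz nanoseconds, so the wait is
    latency_ns * clock_ghz cycles, rounded up to a whole cycle.
    """
    return math.ceil(latency_ns * clock_ghz)


# Hypothetical latency, chosen only for illustration.
ASSUMED_LATENCY_NS = 50.0

for clock_ghz in (1.0, 2.0, 10.0):
    cycles = access_cycles(ASSUMED_LATENCY_NS, clock_ghz)
    print(f"{clock_ghz:5.1f} GHz -> {cycles} cycles per access")
```

With the latency held constant, a tenfold increase in clock rate means ten times as many stalled cycles per access, which is why moving memory closer to the core is attractive.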



Memory Wall Is Growing 




Figure 11 
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SIA ROADMAP IMPACT ON COMPUTER ARCHITECTURES 


This is a vivid example of why computer architectures will not scale with CMOS 
feature size. The spatial extent of a clock (logic functions) has shrunk from an entire die 
to a very small region within a die. New, innovative architectures, packaging and 
interconnect are required to take efficient advantage of the increased number of 
transistors per mm². Spatial locality is critical. Note: wire delays are not 
decreasing at the same rate as the transistor delays. 

SIA Roadmap - Another interesting trend from the physics of signal propagation 
is illustrated here. When clocks were slow, electrical signals could easily traverse across 
the chip in the time to settle between subsequent timing edges. As clock speeds have 
increased, the physical distance that a signal can travel and settle between subsequent 
timing edges has decreased. This trend will continue, even with the introduction of 
repeaters. It is expected that changes in circuit implementation, such as a greater 
exploitation of asynchronous communication will begin, and that these changes will also 
affect the performance of various new architectures. 


SIA Roadmap Impact 
on Computer Architectures 



[Figure: single clock area on a 400 mm² die versus year, 1996 to 2014, with 10 GHz and 13.5 GHz points marked.] 

New architectures are required to accommodate smaller clock regions 


Figure 12 




NOVEL ARCHITECTURES ARE NEEDED TO EXTEND PRODUCTIVITY 


To date, the programming model has hidden the architectural techniques (tricks) 
used to keep pace with Moore's Law. 

Conventional processor scaling, or peak performance improvement, is going to 
slow down because of limited improvement in the number of instructions per cycle and the end of 
pipelining advantages (the number of gates between clock edges cannot keep shrinking). 

The realization of these techniques to date has resulted in enormously increased 
complexity, chip design cost and schedule, and chip software costs, to the point where only 
a few large, elite companies can now afford to develop new families of computers. 

The challenge now is to exploit and develop new solutions to make systems more 
productive! 

Over the past 20 years (and in fact since the first microprocessor in 1971), 
microprocessor performance has been improving at a rate of approximately 52% per year. 
This exponential increase in performance has come from three sources: (1) improvements 
in clock rates from innovations in silicon processing technology that have made 
transistors smaller and faster - 19%/year, (2) improvements in clock rates resulting from 
deepening pipelines and reducing the number of levels of logic per clock - 9%, and (3) 
improvements in microarchitectures including multi-instruction issue and out-of-order 
execution - 18%. Modern designs have nearly exhausted the benefits of pipelining, and 
conventional architectures are struggling to sustain even one instruction per cycle. 
Without further innovations, performance improvements will at best only match the rate 
of improvement due to further process technology innovations, which is projected to 
continue at 19% per year. While microprocessors have sustained performance 
improvements of 52%/year, fabrication technology has actually provided a much higher 
growth rate in potential capability. When accounting for increased transistor counts and 
faster transistor switching speeds, the capability of microprocessor-scale integrated 
circuits has been improving at 74%/year. Until now, the differential between the 74% 
and 52% rates has resulted in only a factor of 30 of untapped performance potential. 
However, with only 19% per year projected in the future, the differential is expected to 
increase to a factor of 30,000 by 2020. This quantity represents a tremendous 
opportunity for novel architectures to help bridge the performance gap and to enable 
future computer systems to solve increasingly complex and important problems. 
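
The growth-rate arithmetic in this paragraph can be checked with a short Python sketch. It assumes that the three sources of improvement compound multiplicatively and that the untapped-potential factor grows each year by the ratio of the two annual rates; the function names are illustrative and do not come from the presentation.

```python
import math


def compound(*annual_rates: float) -> float:
    """Combine independent annual growth rates multiplicatively."""
    total = 1.0
    for rate in annual_rates:
        total *= 1.0 + rate
    return total - 1.0


def years_to_gap(target_gap: float, potential_rate: float,
                 achieved_rate: float, start_gap: float = 1.0) -> float:
    """Years for the potential/achieved ratio to grow from start_gap to target_gap."""
    yearly_ratio = (1.0 + potential_rate) / (1.0 + achieved_rate)
    return math.log(target_gap / start_gap) / math.log(yearly_ratio)


# Process technology 19%, deeper pipelines 9%, microarchitecture 18%.
combined = compound(0.19, 0.09, 0.18)

# 74% potential versus 52% achieved: time to open a factor-of-30 gap.
years_to_30 = years_to_gap(30.0, 0.74, 0.52)

# 74% potential versus 19% achieved: time to widen the gap from 30 to 30,000.
years_to_30000 = years_to_gap(30000.0, 0.74, 0.19, start_gap=30.0)

print(f"combined rate: {combined:.1%} per year")
print(f"years to a factor of 30: {years_to_30:.1f}")
print(f"further years to a factor of 30,000: {years_to_30000:.1f}")
```

Under these assumptions the three components compound to roughly 53% per year, close to the quoted 52%. A 74% versus 52% differential takes about 25 years to open a factor-of-30 gap, and a 74% versus 19% differential widens that gap from 30 to 30,000 in about 18 years, which is consistent with the 2020 projection.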
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EMBEDDED COMPUTING PERFORMANCE REGIONS 


FOM2.PPT, James C. Anderson, 6/5/00 

Upper diagonal line: 6U VME limit is 6A@5V & 1A@+/-12V = 54W per slot 
(6.3x9.2x0.8” slot-to-slot spacing) @60C, “System Packaging Products,” Carlo Gavazzi, 
Inc., Mupac Business Unit. SEM-E (MIL-STD-1389 & IEEE-Std-1101.4-1993) is 12W 
conservative, 24W typ & 48W peak (5.88x6.41x0.6” slot-to-slot spacing). 

Projected requirements for processor subassemblies of selected DARPA 
projects: Space-Based Radar ca. 2008 (1100 GOPS, 20W, 0.05 cu ft), Uninhabited 
Aerial Vehicle ca. 2005 (710 GOPS, 400W, 2 cu ft), Soldier’s Radio for Small Unit 
Operations ca. 2002 (13 GFLOPS, 0.5W, 0.022 cu ft) 

Graybill Notes: This viewgraph highlights the relative upper boundaries for the three 
original processing options and the new PCA class in terms of computational 
efficiency (GOPS/Watt) and computational density (MOPS/cm²). There is a key third 
dimension, not shown, that would highlight the great variation in efficiency as a 
function of kernel type. 
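
As a rough illustration of the efficiency metric, the Python sketch below computes operations per watt for the three projected DARPA requirements quoted above, and checks the 54 W per slot figure for 6U VME. The class and field names are illustrative assumptions. Density is computed per cubic foot because that is the unit given in the requirements, not the per-area unit used on the chart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    name: str
    giga_ops: float    # GOPS (GFLOPS for the radio), as quoted in the text
    watts: float
    cubic_feet: float

    def efficiency(self) -> float:
        """Giga-operations per second per watt."""
        return self.giga_ops / self.watts

    def density(self) -> float:
        """Giga-operations per second per cubic foot."""
        return self.giga_ops / self.cubic_feet


REQUIREMENTS = [
    Requirement("Space-Based Radar (ca. 2008)", 1100.0, 20.0, 0.05),
    Requirement("Uninhabited Aerial Vehicle (ca. 2005)", 710.0, 400.0, 2.0),
    Requirement("Soldier's Radio (ca. 2002)", 13.0, 0.5, 0.022),
]

# 6U VME slot limit: 6 A at 5 V plus 1 A on each of the +12 V and -12 V rails.
VME_SLOT_WATTS = 6 * 5 + 2 * (1 * 12)

for req in REQUIREMENTS:
    print(f"{req.name}: {req.efficiency():.2f} G(FL)OPS/W, "
          f"{req.density():.0f} G(FL)OPS per cu ft")
print(f"6U VME slot limit: {VME_SLOT_WATTS} W")
```

On these figures the space-based radar requirement works out to 55 GOPS/W, about thirty-one times the 1.775 GOPS/W of the UAV requirement, which shows how widely the projected targets vary.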

With the advent of Polymorphous Computing technology a new class of 
computing options will now be available for embedded computing. Polymorphous 
Computing offers almost the density of VLSI but with the programmability of 
conventional computers. In addition the architecture or virtual machine realization may 
be changed dynamically in response to mission processing/threat requirements. Agile 
virtual selection of computer types may now be done during the mission instead of 
locking into a pre-determined mix of processor types during platform development. 
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Novel Architectures are Needed 
to Extend Productivity 
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Embedded Computing 
Performance Regions 




Figure 14 
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Polymorphous Computing 
Architectures Program 



Goal: Computing systems (chips, networks, software) that 
will morph to changing missions, sensor configurations, 
and operational constraints during a mission or over the life 
of the platform. 
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Mission Aware Embedded Computing 



Figure 15 



Agenda 



Introduction 

- DoD System Challenges 

- Motivation for Cognitive Computing 
Technology Trends 

- Device Performance 

- High Performance Computing 
Cognitive Systems Vision 
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DARPA/IPTO COGNITIVE SYSTEMS 



• DARPA IPTO will create a new generation of 
cognitive computational and information 
systems with capability to: 

- reason, using substantial amounts of 
appropriately represented knowledge 

- learn from their experience so that they perform 
better over time 

- explain themselves and be told what to do 

- be aware of their own capabilities and reflect on 
their own behavior 

- respond robustly to surprise 



Cognitive Systems 


DARPA/IPTO 
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ANATOMY OF A COGNITIVE AGENT 


This chart illustrates one possible architecture for realizing a cognitive 
computational system. The diagram shows the relationships and connectivity among the 
three major processes that are usually associated with cognition. It also shows the relationship 
between these processes and the machine’s perception (sensors), action (effectors), and 
the environment. 



Figure 18 
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Initial Challenge Context 



Persistent, personal partner/associate systems 

• Learn from experience 

• Learn what you like and how you operate 

  - by observation 

  - by direct instruction or guidance, in a natural way 

• Imagine possible futures, anticipate problems and needs 

• Omnipresent / always available 

Examples 

• Commander’s (C2) assistant 

• (Intelligence) Analyst’s associate 

• Personal executive assistant/secretary 

• Disaster response captain’s “RAP” (robot/agent/person) 
team 
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Focal Challenge Context 



An Enduring, Personalized, Cognitive Assistant 



and, never failed... 
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A Cognitive System 




* Uses knowledge of the domain, task 

* Exhibits purposeful perception 

* Can imagine possible futures 

* Can decide what to do (prioritize) and act 

* Learns, including by observing partner 

* Can be advised and guided, and can explain 

* Must know how to cooperate 

* Multi-modal, broad-spectrum interaction 

* Available everywhere - omnipresent 

* Must be trustworthy 

* Must learn continuously 

* Able to survive, operate through problems 


ENDURING, PERSONALIZED, COGNITIVE ASSISTANT 
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DARPA/IPTO Cognitive Systems 
Prospective Program Overview 



Single-Agent Cognition 

Individual systems that 
perceive, reason, learn, and 
respond robustly to surprise 


Collaborative 

Cognition 

• Teams of 
cognitive 
agents 
working 
together 



Enduring, 
personalized 
cognitive 
assistant 


Collective 

Cognition 

• Collective 
intelligence 
from 

distributed 

behavioral 

elements 



[Technology Programs] [Integration Programs] 


Figure 23 







Summary 



Cognitive Systems will be the Next Revolution in Computing 

Cognitive Systems know what they are doing: 

• They can reason 

• They can learn from their experience 

• They can explain themselves 

• They are aware of their own capabilities 

• They can respond robustly to surprise 

Send us your best ideas: 

http://www.darpa.mil/ipto/Solicitations/PIP 02-21 .html 

• Take a tour as a DARPA Program Manager 

rbrachman@darpa.mil (703) 696-2264 

zlemnios@darpa.mil (703) 696-2234 
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ASYMMETRIC ADVANTAGE ENABLED BY INFORMATION SUPERIORITY 


“Not long ago, a prime contractor was 
one that built a jet or ship. Now, these 
vehicles are simply “platforms” for 
sensors and information systems.” 

Washington Post; Analysis of NG bid for 
TRW, Feb. 23, 2002 

Asymmetric Advantage - Information superiority gives US forces an asymmetric 
advantage over our adversaries. This chart shows a possible future battle space 
environment with a representative threat and a number of individual high performance 
platforms and capabilities that can be brought to bear against all threats. In addition, 
these platforms will also be networked together to enable instant sharing of information. 
The network will close the sensor-to-shooter latency for the classes of dynamic targets 
that are difficult to detect by conventional means. Advances in electronics and in 
computation will enable us to see farther with greater clarity than our enemies could ever 
imagine. 
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INFORMATION NETWORKS HAVE REVOLUTIONIZED PLATFORMS 


Information Networks Will Revolutionize Platforms - Networks will also be used 
at the platform scale to revolutionize their capabilities. On the upside, these now 
networked systems will be more survivable, more lethal, more adaptable, and more agile 
than before. The downside, though, is complexity. Not only are there issues relating to 
physical interconnections and signal routing, but there is also a massive growth in the 
number of lines of software code necessary for managing and exploiting the information 
on the network. Integration costs are also escalating. Actual hardware costs can be a 
fraction of the total, while software and integration/test are accounting for a major part of 
the “fly-away” cost of a platform. 


Information Networks Have 
Revolutionized Platforms 



TODAY - Federated Architecture (Baselines 1-5) 

♦ Point-to-Point Mainframes (UYK-7/43) 

♦ Limited Growth Capability 

♦ Vulnerable to Damage 

FUTURE - Distributed Processing (Future B/L) 

♦ Highly Distributed Network 

♦ Redundancy Plus Reconfigurability 

♦ Effectively Invulnerable to Battle Damage 

The Upside: Agile, Adaptable, Survivable Systems 
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The Downside: Complexity in h/w, s/w, integration 
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THE RESULT WILL ENABLE A REVOLUTION IN CAPABILITY FOR DOD 


The Result Will Enable a Revolution in Capability for DoD - Potential threats are 
increasing and human analysts are overwhelmed by sensor data. Adaptive and intelligent 
data-fused sensors will enhance system performance against a changing threat 
environment. Cognitive information exploitation will provide the knowledge and 
information from fused sensors. 





The Result Will Enable a 
Revolution in Capability for DoD 


More Aggressive Threats 

Adaptive and Intelligent Data-Fused Sensors 


Threats are more dynamic and in deeper hide (collapsing time lines) 
System performance is outpaced by changing threat environments 
Cooperative battle management requires robust information backbone 


Sensor Data Flow 
Overwhelming Human Analyst 


Cognitive Information 
Exploitation 


Sensor bandwidth is increasing faster than processor capability 
Target classification has become a multi-sensor problem 




The next revolution in sensing: Autonomous Adaptation 
The next revolution in computing: Cognitive Processing 
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BEYOND CMOS: THE ROAD BEYOND BULK SILICON 
FIELD EFFECT TRANSISTORS 


This is an illustration of the technology roadmap for devices that might evolve 
from today's bulk CMOS. The graph shows that even now, the measured and expected 
performance for deeply scaled transistors is starting to deviate from the extrapolations 
from the past. This deviation means that even if the devices continue to shrink in size 
and grow in number, the performance of microprocessors will not scale as it has in the 
past. MTO is sponsoring research to close this technology gap and to explore the classes 
of nanoscale devices that have interesting terminal properties and might be useful for 
electronics. To date, the work on carbon nanotubes and molecules with large 
conformational changes is starting to bear fruit in this area. 
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Combined Air Operations Center 
(CAOC) 



* CAOCs coordinate theater-of-war air operations, such as Desert 
Storm or Enduring Freedom 

* Despite automation, staff/sortie doesn't shrink 

* People must adapt to computers, rather than the 
other way around 


Operation                                   Year        Maximum sortie capability   Sorties per person 

Operation Desert Storm - Gulf War           1992        2000+                       0.8 
Operation Allied Force - Kosovo             1999        800                         0.3 
Operation Enduring Freedom - Afghanistan    2001        500                         0.7 
Southwest Asia major-theater war            2002/est.   1000                        0.6 
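
The sorties-per-person figures on this slide can be inverted to give staff per sortie, the quantity the slide says has not shrunk. The short Python sketch below does that arithmetic; it assumes the four values map to the four operations in the order listed, and the names are illustrative.

```python
# Sorties per person, read from the slide in the order the operations appear.
SORTIES_PER_PERSON = {
    "Desert Storm": 0.8,
    "Allied Force": 0.3,
    "Enduring Freedom": 0.7,
    "Southwest Asia major-theater war (estimate)": 0.6,
}


def staff_per_sortie(sorties_per_person: float) -> float:
    """Staff needed per sortie is the reciprocal of sorties per person."""
    return 1.0 / sorties_per_person


for operation, rate in SORTIES_PER_PERSON.items():
    print(f"{operation}: {staff_per_sortie(rate):.2f} staff per sortie")
```

On these figures the earliest operation, Desert Storm, has the lowest staffing per sortie (1.25), and none of the later operations improves on it.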
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Combined Air Operations Center 
(CAOC) 





Director Mobility Forces 


Intelligence, Surveillance, 
and Reconnaissance 


Strategy 


Current CAOC Personnel Requirement Total: 828 


Prince Sultan Air Base 


Combat Ops 


■ Combined Air Operations Center Staff =10 

■ Combat Operations =141 

■ Strategy =31 

■ Combat Plans =105 

■ Combined Air Operations Center Management =B 

■ Communications Support =179 

■ Regional Air Movement Control Center =13 

■ Information Warfare =44 

■ Intelligence, Surveillance and Reconnaissance =229 
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


A Cognitive System 




Radar O'Reilly 

- observed 

- anticipated 

- planned 

- worked autonomously 
(but supervised) 



Figure 32 



A Cognitive System 



♦ Will have and use knowledge of the domain, task 

♦ Exhibits purposeful perception: uses models of 
the world to guide 

♦ Can imagine possible futures 

♦ Can decide what to do (prioritize) and act in real 
time 


Radar O'Reilly 

- observed 

- anticipated 

- planned 

- worked autonomously 
(but supervised) 


COGNITIVE 
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A Cognitive System 




♦ Will have and use knowledge of the domain, 
task 

♦ Exhibits purposeful perception: uses models 
of the world to guide 

♦ Can imagine possible futures 

♦ Can decide what to do (prioritize) and act in 
real time 

♦ Learns, including by observing partner 

♦ Can be advised and guided, and can explain 


Radar O'Reilly 

- observed 

- anticipated 

- planned 

- worked autonomously 
(but supervised) 


PERSONALIZED, COGNITIVE 
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A Cognitive System 




Radar O'Reilly 

- observed 

- anticipated 

- planned 

- worked autonomously 
(but supervised) 


* Will have and use knowledge of the domain, 
task 

* Exhibits purposeful perception: uses models 
of the world to guide 

* Can imagine possible futures 

* Can decide what to do (prioritize) and act in 
real time 

* Learns, including by observing partner 

* Can be advised and guided, and can explain 

* Must know how to cooperate (be a team player) 

* Should be available everywhere - omnipresent 

* Must be trustworthy 



PERSONALIZED, COGNITIVE ASSISTANT 
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A Cognitive System 




Radar O'Reilly 

- observed 

- anticipated 

- planned 

- worked autonomously 
(but supervised) 


♦ Will have and use knowledge of the domain, 
task 

♦ Exhibits purposeful perception: uses models 
of the world to guide 

♦ Can imagine possible futures 

♦ Can decide what to do (prioritize) and act in 
real time 

♦ Learns, including by observing partner 

♦ Can be advised and guided, and can explain 

♦ Must know how to cooperate (be a team 
player) 

♦ Uses multi-modal, broad-spectrum 
interaction 

♦ Should be available everywhere - 
omnipresent 

♦ Must be trustworthy 

♦ Must learn continuously 

♦ Must be able to survive, operate through 
problems 


ENDURING, PERSONALIZED, COGNITIVE ASSISTANT 
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Adaptive Networking 



■ Remove the burden of 
network management/ 
operations from CAOC staff 

■ Create new operational 
opportunities through flexible, 
symbolic control 


Technical Challenges 

■ Allow network to be aware of itself and be more 
responsible for its own adaptation 

■ Manage ever-growing complexity 

■ Allow network applications with comprehensive 
reach to peer into and leverage the operational 
network 

■ Create shared structural and behavioral models 
of network in operation 

■ Techniques to allow collective (distributed) 
cognition across multiple knowledge 
applications 



Technical Approach 


■ Create separate network overlay with 
explicit models and knowledge structures 
covering entire network ("knowledge 
plane") 

■ Separate algorithms, policies, goals, and 
general knowledge for easier update and 
to facilitate learning 

■ Apply learning mechanisms to allow 
network to adapt over time 

■ Add extra mechanisms to enhance security 
and privacy 
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Adaptive Networking 

Sample Application: "Why?" 


Fault management is illustrative of key issues in 
cognition and networking 


[Diagram: K-Application "Why?": network fault detection, isolation, and repair. K-Base: inference rules, diagnostic procedures. Models: models of Internet structure, application behavior, requirements.] 


The "Why?" application: 

- "Why?" explains and fixes network 
problems 

- Relevant data is represented, routed, 
and aggregated in knowledge plane 

- Information "features" are analyzed 
using modern probabilistic models, 
inference engines 

- Actuation in better-than-human time 

How is it done today: 

- Ad hoc, out-of-band sharing of 
human-readable information between 
operators 

- Low-level tools like "ping" and 
"trace route" 

What's new: 

- Observations from multiple vantage 
points 

- Collective action to resolve problem 

- Mixed-mode distributed teaming 

- Framework for privacy, security, and 
marketplaces of data 

- Endpoint participation and knowledge 
sharing 
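
To make the idea of combining observations from multiple vantage points with a probabilistic model more concrete, here is a minimal Python sketch. It is not the "Why?" application itself: it assumes exactly one failed link, a uniform prior over links, independent probes, and made-up false-alarm and miss rates. The topology, link names, and function name are all hypothetical.

```python
def localize_fault(paths, probe_ok, false_alarm=0.05, miss=0.05):
    """Posterior probability that each link is the single failed link.

    paths:       vantage point -> list of links its probe traverses
    probe_ok:    vantage point -> True if its probe reached the target
    false_alarm: assumed chance a probe fails although its path is healthy
    miss:        assumed chance a probe succeeds although its path is broken
    """
    links = sorted({link for path in paths.values() for link in path})
    scores = {}
    for candidate in links:
        likelihood = 1.0  # uniform prior over links, so the prior cancels
        for vantage, path in paths.items():
            if candidate in path:
                # The candidate failure would break this vantage point's path.
                likelihood *= miss if probe_ok[vantage] else (1.0 - miss)
            else:
                likelihood *= (1.0 - false_alarm) if probe_ok[vantage] else false_alarm
        scores[candidate] = likelihood
    total = sum(scores.values())
    return {link: score / total for link, score in scores.items()}


# Hypothetical topology: three vantage points probing one server.
PATHS = {
    "A": ["a-r1", "r1-r2", "r2-server"],
    "B": ["b-r1", "r1-r2", "r2-server"],
    "C": ["c-r2", "r2-server"],
}
PROBE_OK = {"A": False, "B": False, "C": True}

posterior = localize_fault(PATHS, PROBE_OK)
suspect = max(posterior, key=posterior.get)
print("most likely failed link:", suspect)
```

No single vantage point can isolate the fault here. Only the combination of two failures that share a link and one success that rules out the last hop points to the link between the two routers, which is the kind of collective diagnosis the slide contrasts with ad hoc use of ping and traceroute.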
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Collaborative Cognition 


■ Take advantage of the increased 
intellectual power of collaborative 
teams 

■ Demonstrate radical new 
capability for military operations 
with focus on goal-directed 
teamwork among military 
operators, software agents, robots 



Remote Agents 

Teammates 


Technical Challenges 

■ "Team-cognition" system architectures that 
improve overall capabilities 

■ Task performance gains for team pairs 
employing mutual core capabilities to sense, 
act, and move as an integrated team 

■ Natural, "high-bandwidth" communication 
between team members at multiple levels of 
abstraction 

■ Ability for team to function well even when 
running asynchronously 


Technical Approach 

■ Develop methods for rapid development 
and deployment of heterogeneous intelligent 
entities in complex environments 

■ Enable agents to monitor, assess, and 
explain environmental situations 

■ Use plug and play modules for cognitive processes 

• Allow agents to have models of other agents 

• Facilitate agent learning to decrease time 
and effort to create agents and increase 
reusability across applications 

• Support "selective autonomy" 
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Collaborative Cognition 



Creating systems that are capable of collaborating 
with each other, as well as with humans 



■ Teams will cooperate and 
adapt as situations dictate 

* Communications will take 
advantage of a net-centric 
environment in an essential 
way, using wireless 
interaction for natural 
language and data 

■ Collaborative systems will 
create a true force multiplier 

■ Heterogeneous reasoning will unlock key aspects of the problem 

■ Recent studies have shown that groups can solve problems 
that individuals cannot 

■ Ultimate payoff will be adding synthetic agents as team members 

■ Robots can replace humans in high-risk situations 

- Use same software in simulation and operational systems 
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Localize 


Self-Aware Peer-to-Peer Networks 


CONOPS: 

UAV identifies areas of interest 

Expendable UAV extends CVBG Surveillance range and delivers UGS 
Robotic sensors detect, localize, identify and target threats 
UAV interfaces to C4I systems on CVBG 

UUV surveys coastline and determines best insertion point for UGVs 
Commander provides launch "authorization" 




Multiple UAV Control 
Multiple Payload Control 
Dissimilar Vehicles 
Airborne TCS 


Air/Sub Launched 
Expendable UAV 
EO Sensor Payload 
UGS Delivery 


UAV Launch with sensor payload 


• Autonomous 


Unmanned 

Undersea 

Vehicle 
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Self-Aware Peer-to-Peer Networks 



■ Resilient, scalable, self-diagnosing and 
self-healing networks 

■ Operate in resource-constrained 
environments 

■ Components remain aware whether 
linked or disconnected 



Key Technical Goals: 

■ Representations for network state, 
observable/sensed conditions, and capabilities of 
network components 

• Reasoning regimes to allow components to be self- 
aware, understand overall mission 

• Peer-to-peer ability to share knowledge about 
network state and optimal policies for 
troubleshooting 

■ Incorporation of remedial actions into the model 

■ Continuous learning by sensing network conditions 
on a regular basis and adapting recommended 
remediation policies 
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Build Cognitively Aware Computational 
Systems to Enhance Human & Computer 
Performance 

AugCog will enable computational systems to dynamically 
adapt to users by developing the means to: 

Phase 1: Measure cognitive state 
Phase 2: Manipulate cognitive state non-invasively 
Phase 3: Automate cognitive state manipulation 
Phase 4: Operational demonstration and test 


Technical Challenges 

■ Demonstrate manipulation of cognitive state 
in less than 1 minute, with no performance 
degradation 

■ Overcome the bottlenecks in cognitive 
processing to enable the manipulation of 
cognitive state 

■ Develop whole-head imaging system with an 
emphasis on miniaturization and 
ruggedization of current device designs 


Technical Approach 

■ Enhance attention management via a 
directed attention and autonomous task 
delegation strategy 

• Maximize executive functioning and 
facilitate memory enhancement via an 
automatic cued retrieval strategy 

■ Exploit multiple sensory channels via an 
autonomous information delivery strategy 
to multiple modalities 

■ Maximize working memory processes via 
an autonomous intelligent interruption 
and negotiation strategy 
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Augmented Cognition (AugCog) 

Phase 1 Technical Accomplishments 



• Phase 1 Objective (FY02-03): Assessed cognitive state in 
real time with minimal or no interference 

Achieved through the development of robust, non-invasive, 
real-time cognitive state detection technology. 

* Phase 1 Successes: 

Cognition Monitoring system detected cognitive state 
shift (Verbal to Spatial) in <1 min using EEG signals 

Cognitively designed information system made it 
easier for people to encode, store, and retrieve 
information - Results indicate that the users 
demonstrated a 131% improvement in memory 

Demonstrated 380% increase in accuracy and a >3x 
reduction in operator stress on a complex task with two 
sources of interruption 
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Augmented Cognition (AugCog) 

Phase 2 



Objective: Cognitive state manipulation in less than 1 
minute with no degradation in performance. 


Input Cognitive Output 



AugCog Phase 2: 
Manipulation Strategies 

Sequential Processing 

targets the Working Memory 
bottleneck 

Cued Memory Retrieval 

targets the Executive Function 
bottleneck 

Multimodal Systems 

target the Sensory Input 
bottleneck 


Attention Management 

targets the Attention bottleneck 


Breaking the bottlenecks in cognitive processing to enable the 
manipulation of cognitive state 


Figure 45 
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Augmented Cognition (AugCog) 

Brain Regions & Cognitive Bottlenecks 


Verbal Working 
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Spatial Working 
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Executive 
Function (fNIR) 
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High Productivity Computing 
Systems (HPCS) 



Majority of new installed 
capability systems are 
commodity computing based 

Today's 


Vision 

Provide a new generation of 
economically viable high productivity 
computing systems for the national 
security and industrial user 
community (2007-2010) 



late 80's HPC technology 


Fill the Critical Technology and 
Capability Gap 

Extensibility of Moore's Law 
into the next decade 


Address critical national security 
and industrial application 
requirements 

2010 


High-end Computing Solutions 


Technical Challenge 

Balanced scalable system architecture 
High effective bandwidth (on/off chip) 

System robustness or brittleness 

Performance monitoring, measurement and prediction 

System “design point” tailorability 


Commodity High 
Performance 
Computers have 
demonstrated 
inefficiency 
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